PREFACE



Sex discrimination has been prohibited by law in educational
institutions that receive federal support. Yet there is still concern
that many educational policies, procedures, and practices reinforce
sex-role stereotypes and reflect prejudiced views of women's
achievements, abilities, and interests. Educational tests have been
examined for bias against women by a variety of procedures (Tittle,
109), as tests in general have been examined for bias against minority
groups (Flaugher, 33). The variety of procedures used in studies
indicates that there is not one procedure or definition that will
quickly tell the policy maker, test user, or test publisher that an
educational test is or is not biased against women. Rather, there are
currently being developed a series of guidelines encompassing
procedures that will need to be considered to permit the statement
that a test and its use are sex-fair or, said in another way, that the
test and the test use are as free from sex bias as it is possible to
determine, given the present state of the art in identifying aspects
of discrimination and in the field of testing. This monograph reflects
the state of the art in both testing and analyses related to
educational equity. Both provide sufficient guidance to inform policy,
improve educational practice, and provide sex-fair instruments and
test use.
INTRODUCTION



WHY EXAMINE SEX BIAS IN EDUCATIONAL 
TESTING? 

Educational testing is a common experience in the school life of
American children. Testing is carried out at all levels of the
educational system. From a concern at the national level for the
status of basic knowledge of American students and adults, as
monitored by the National Assessment of Educational Progress, to the
state-level testing of minimum competencies for graduation or
assessment of statewide progress in education, to the use of tests in
local school systems for assessing and diagnosing student progress,
most school children and the majority of adults today will have had
the experience of taking an educational test. Tests are also widely
used in evaluations of federal programs that allocate funds to the
states for various educational programs.

Other major areas of test use are in the selection of students for
college admissions and in career guidance and counseling. Tests are
also used in the selection of students for occupational programs.
School systems and states also use tests to certify teachers and to
select employees. An estimate of the types of tests administered
annually was made by Holmen and Docter (51). They listed three main
areas of testing programs and their proportion of total test use:
1) educational achievement testing (65 percent); 2) testing for
selection and placement (30 percent); and 3) testing in counseling,
guidance, and clinical work (5 percent of tests used). Some indication
of the volume of tests administered is found in a recent estimate of
3.5 million interest inventories scored annually by major test scoring
services (111). Holmen and Docter estimated that 200 million
achievement test forms and answer sheets were used annually in the
United States as of 1972.

In higher education, where tests are used in admission procedures,
other forms of educational testing will also become more prominent.
The City University of New York, for example, has recently instituted
a policy of assessing student competence in writing and reading at the
transition point between the sophomore and junior years. This policy
will affect about 150,000 students. Educational testing, then, from
the kindergarten and first-grade level of Title I testing, through
testing for minimum competency standards for high school graduation,
to college entrance and minimum competencies for transition points in
college, to the use of interest measures to assist career choice, is
widespread throughout the American educational system.

The increasing numbers of tests being developed and the large numbers
of students tested, half of whom will be women, have made educational
testing a subject of examination by those concerned with equality of
opportunity for women and educational equity. Since test content
becomes a part of the school's materials, just as textbooks and
beginning readers are, tests help to form the view that students have
of themselves. In particular, they help to reinforce and illustrate
the views our culture holds of appropriate roles for men and women.
These roles for women and men are conveyed in a number of spheres: in
the home, in school, in male-female interactions, in child rearing,
and in occupational settings.

In addition to the fact that tests are part of the educational setting
for students, they are important in another way. This other evidence
of their importance is found in the effects of tests on students,
parents, and teachers. The Russell Sage Foundation has funded a series
of studies that have examined attitudes held about intelligence tests
and teachers' views of their preparation for understanding tests. In a
survey of American beliefs and attitudes about intelligence, Brim et
al. (7) found that almost 80 percent of public school students believe
that intelligence tests are somewhat or very accurate. There is some
evidence that these views are held generally and extend to the areas
of college entrance testing and the assessment of school achievement
(61).

Goslin (35) conducted a survey examining teacher attitudes toward
standardized tests. He found that teachers tend to view standardized
tests as relatively accurate measures of a student's intellectual
potential and achievement; that teachers see the kinds of abilities
measured by standardized tests as important determinants of the
subsequent academic success of children; and that they believe that
considerable weight should be given to test scores, along with school
grades, in making decisions about special classes, college admissions,
and so on. There is an indication of an internal consistency in the
belief systems of some teachers concerning tests and their use.
Teachers who expressed confidence in the accuracy of standardized
tests also felt that the tests measured the qualities necessary for
success; they also believed that the abilities measured were, to a
significant degree, innate rather than learned. These teachers tended
to have had more contact with tests and more formal training in
psychometrics.

FEDERAL LAWS AND REGULATIONS

In addition to the widespread use of tests, their place in the context
of education for students, and the likely beliefs of their value held
by teachers, there are other current reasons for examining educational
tests for educational equity for women and educational policy. Federal
law now provides regulatory and legal pressures for equity and
fairness in testing. Three examples are Title IX, the Uniform
Guidelines on Employee Selection Procedures, and the Vocational
Education Act of the Education Amendments of 1976. Title IX, in the
Education Amendments of 1972, has the regulations that most widely
affect all levels of educational practice. Title IX prohibits
discrimination on the basis of sex against most adults employed in
educational settings and most students. The same benefits and
opportunities for job advancement are to be offered men and women, and
boys and girls are to receive the same instruction and treatment
without regard to their gender. Although curriculum materials are
excluded from Title IX, the use of tests and counseling are not. Tests
and materials used by counselors and teachers in guidance must be
nondiscriminatory. If this is broadly interpreted, achievement,
aptitude, and interest tests all fall within the purview of Title IX.

Schiffer (95) has described the legal regulations that are related to
selecting and using interest inventories under Title IX and the 14th
Amendment to the Constitution (the Equal Protection Clause). Title IX
of the Education Amendments Act of 1972 has specific requirements for
eliminating bias in test use and counseling for every school that
receives federal funds. Title IX provides:

No person in the United States shall, on the basis of sex, be
excluded from participation in, be denied the benefits of, or be
subjected to discrimination under any education program or
activity receiving federal financial assistance....

One section of the regulation is particularly relevant to schools in
using interest inventories and counseling for career selection. This
is 45 C.F.R. §86.36: Counseling and Use of Appraisal and Counseling
Materials.

(a.) Counseling. A recipient shall not discriminate against any
person on the basis of sex in counseling or guidance of students
or applicants for admission. (b.) Use of appraisal and counseling
materials. A recipient which uses testing or other materials for
appraising or counseling students shall not use different
materials for students on the basis of their sex or use materials
which permit or require different treatment of students on such
basis unless such different materials cover the same occupations
and interest areas and the use of such different materials is
shown to be essential to eliminate sex bias. Recipients shall
develop and use internal procedures for ensuring that such
materials do not discriminate on the basis of sex. Where the use
of counseling, test, or other instruments results in a
substantially disproportionate number of members of one sex in
any particular course of study or classification, the recipient
shall take such action as is necessary to assure itself that such
disproportion is not the result of discrimination in the
instrument or the applications.



Similar ideas appear in the Title IX regulation regarding
discrimination on the basis of sex in admission and recruitment. This
section of Title IX includes the following:

(1) In determining whether a person satisfies any policy or
    criterion for admission, or in making any offer of admission,
    a recipient to which this subpart applies shall not: (i.) give
    preference to one person over another on the basis of sex, by
    ranking applicants separately on such basis, or otherwise;
    (ii.) apply numerical limitations upon the number or
    proportion of persons of either sex who may be admitted; or
    (iii.) otherwise treat one individual differently from another
    on the basis of sex.

(2) A recipient shall not administer or operate any test or other
    criterion for admission which has a disproportionately adverse
    effect on persons on the basis of sex unless the use of such
    test or criterion is shown to predict validly success in the
    education program or activity in question and alternative
    tests or criteria which do not have such a disproportionately
    adverse effect are shown to be unavailable (Federal Register,
    1975).

These regulations are similar to the regulations in the Uniform
Guidelines on Employee Selection Procedures (Federal Register,
December 20, 1977). The Uniform Guidelines represent a consensus among
federal agencies: the Civil Service Commission, the Equal Employment
Opportunity Commission, the Department of Justice, and the Department
of Labor. The Uniform Guidelines apply a comparable definition of
discrimination to employment decisions. Employment decisions are
broadly defined; they include, but are not limited to, hiring,
promotion, and demotion. Discrimination is defined as the use of any
selection procedure that has an adverse impact on the hiring,
promotion, or other employment or membership opportunities of members
of any racial, ethnic, or sex group unless the procedure has been
validated in accordance with the Guidelines. Adverse impact is defined
as a selection rate for any racial, ethnic, or sex group that is less
than four-fifths (80 percent) of the rate for the group with the
highest rate. This will generally be regarded by the federal
enforcement agencies as evidence of adverse impact. Smaller
differences in selection rate may nevertheless constitute adverse
impact where they are significant in both statistical and practical
terms.
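
The four-fifths definition of adverse impact is a simple ratio of
selection rates, and a short sketch may make the arithmetic concrete.
The function names and the applicant counts below are hypothetical
and chosen only for illustration; they do not come from the
Guidelines or from any study cited here. The sketch checks only the
four-fifths comparison and does not address the statistical and
practical significance of smaller differences.

```python
def selection_rate(selected, applicants):
    """Proportion of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratios(groups):
    """Ratio of each group's selection rate to the highest group rate.

    `groups` maps a group label to a (selected, applicants) pair.
    """
    rates = {name: selection_rate(s, a) for name, (s, a) in groups.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selections")
    return {name: rate / highest for name, rate in rates.items()}


def below_four_fifths(groups, threshold=0.8):
    """Flag groups whose rate is less than four-fifths of the highest."""
    return {name: ratio < threshold
            for name, ratio in impact_ratios(groups).items()}


# Hypothetical counts: 30 of 100 women and 50 of 100 men selected.
example = {"women": (30, 100), "men": (50, 100)}
ratios = impact_ratios(example)      # women: 0.30 / 0.50, men: 1.0
flags = below_four_fifths(example)   # women flagged, men not
```

With these invented counts the women's selection rate is 60 percent
of the men's, which falls below the 80 percent threshold and would
ordinarily be treated as evidence of adverse impact.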

It is interesting to note that selection procedure is very broadly
defined, including the full range of assessment techniques from
traditional paper and pencil tests, performance tests, training
programs or probationary periods, and physical, educational, and work
experience requirements to informal or casual interviews and unscored
application forms.



The Education Amendments of 1976 in the Vocational Education Act (VEA)
also set forth the policy of equal access for all minorities and women
to programs under the legislation and require states to describe the
specific actions taken to overcome sex discrimination. States are also
to specify the incentives adopted to encourage the enrollment of both
women and men in nontraditional courses of study. State plans are to
include model programs developed to reduce sex bias and sex
stereotyping in training programs and placement in all occupations
(Section 104.187, Federal Register, Monday, October 3, 1977).
Federally funded programs, as well as the programs of local districts,
must be concerned with eliminating discrimination based on sex and
with providing sex-fair counseling and guidance activities and
materials, including educational tests.

TESTING PROCEDURES TO EVALUATE EQUITY

The laws and regulations concerned with sex bias and discrimination
provide general principles. A few regulations are specific to sex bias
in educational testing, e.g., the use of career-interest inventories
in counseling and employment testing. However, there is not presently
a consensus on the full set of procedures that would define a sex-fair
(or unbiased) test and its use in all educational settings. The major
sections of this publication provide guidance to policy makers and
test users based on the principles in the various regulations.

The remainder of this publication has two goals. The first is to
present illustrative statistics indicating why administrators and
policy makers should be concerned with educational tests. The why is
the evidence underlying the need to challenge discrimination and the
limiting of options for women. The second, and the main focus of the
presentation, is to raise a series of critical issues for each of the
major types of educational tests: achievement tests, career-interest
inventories, and aptitude measures. The purpose of describing the
issues is to indicate needed policy, procedures that can formalize
policy, and the why of the procedures. These issues will help to
define and suggest at least some of the procedures that are necessary
to make a judgment of the sex-fairness of a test of educational
achievement, career interest, or aptitude used in selection for
specific vocational courses, college admissions, or employment.


The procedures that would ensure sex-fairness for different groups of
women are not well identified. Issues in the testing of minority women
and older women are not treated separately in the present work. As a
reviewer of an early draft commented, the issues of racial bias are at
least as complex as those considered here, and the combined bias that
faces minority women has nowhere been adequately considered. This is
as true in the sex bias literature as in the race/ethnic bias
literature. The reader may find additional concerns identified by
consulting a resource such as the Psychological Testing of American
Minorities (94).

WHY IS SEX BIAS IN TESTING IMPORTANT TO
POLICY MAKERS IN EDUCATION?

Earlier it was suggested that it is important to look at educational
tests because they are part of the student's context in education and
because they are widely used. Underlying the present concern with
educational tests is a thesis on the relationship between education
for women, the degrees they obtain, and their occupational entry and
career paths, a thesis that makes tests a concern to policy makers in
education.

There is a riddle that is sometimes posed to illustrate the strength
of our stereotypes about women. The riddle begins with a father and
son driving on their way to the next town. They have a car accident
and the father is killed instantly. The son is very seriously injured
and is taken to a hospital. The son is immediately taken up to the
surgical ward. The surgeon comes in and says, "I cannot operate on
this boy, he's my son." How do you explain the riddle? Listeners often
puzzle over how the father could be dead and still be there to operate
on his son! The source of the puzzlement is understandable. There are
few women surgeons in this country, and television and other media do
not show women as surgeons and rarely as doctors. While listeners may
be aware that more women are entering medical schools today, they
still may not immediately find the logical solution to the puzzle.

The reactions to the riddle are no accident. They derive from a
historical view of women and can be illustrated by historical views of
the nature of women and men, which present a picture of what women are
thought to be. The following quotations are some of these views of
women and men that limit our perceptions of individuals.

We may thus conclude that it is a general law that there should
be naturally ruling elements and elements naturally ruled...the
rule of the freeman over the slave is one kind of rule; that of
the male over the female another...the slave is entirely without
the faculty of deliberation; the female indeed possesses it, but
in a form which remains inconclusive...
(Aristotle)

The chief distinction in the intellectual powers of the two sexes
is shown by man attaining to a higher eminence, in whatever he
takes up, than woman can attain--whether requiring deep thought,
reason, or imagination, or merely the use of the senses and
hands. (Darwin)

As much as women want to be good scientists and engineers, they
want, first and foremost, to be womanly companions of men and to
be mothers. (Bruno Bettelheim)

Nature intended women to be our slaves;...they are our property,
we are not theirs. They belong to us, just as a tree that bears
fruit belongs to a gardener. What a mad idea to demand equality
for women!...Women are nothing but machines for producing
children. (Napoleon Bonaparte)

These quotations are reflected in the history of education for women
in the United States. As Mathews (70) summarized this history, it was
not possible for women to attend college in this country for 200 years
after the founding of Harvard in 1636. Oberlin became the first
college in the nation to admit women. In 1855 Elmira Female College
was founded; and in rapid succession after the Civil War the others
later known as the Seven Sisters were founded. The first
state-supported college for women in the world was chartered in 1884
and is now known as the Mississippi State College for Women. After the
Civil War, the number of women accepted into previously all-male
institutions seems to have been inversely related to the economic
strength of the college. Established colleges in the East that had
heavy endowments retained their exclusively male student body.
Colleges in the West and denominational colleges and state
universities exhibited less resistance to women as students. Between
1870 and 1890 the number of colleges that admitted women almost
doubled, and the number of female college graduates increased
five-fold. By 1900, graduate and professional schools were opened to
highly motivated women, for the most part. After 1920, the number of
women college graduates increased and continued to rise to the
approximately 40 percent of the graduating classes today.



TABLE 1

Distribution of Females and Males in Vocational Education for Each
Program Area, 1972

                                        FEMALES                              MALES
                               % Gainful   % Incl.                 % Gainful   % Incl.
                               (Excl.      Home-                   (Excl.      Home-
Program Area                   Homemaking) making          N       Homemaking) making          N

Agriculture                         1          1         48,153        17         17        848,307
Distributive education              8          5        290,020         7          7        350,403
Health                              8          4        285,071         1          1         51,581
Home economics, gainful             7          4        240,948         0.1        0.1       39,018
Office                             51         28      1,796,387        11         11        555,491
Technical education                 1          1         33,006         6          6    [illegible]
Trades and industry                 8          4        279,680        43         43      2,118,288
Special programs*                  17          9        582,715        15         15    [illegible]

Total: gainful only               101          -      3,505,128       100          -      4,931,284
Home economics, homemaking          -         45      2,916,987         -          5        248,745

Total: gainful and homemaking       -        101      6,422,115         -        101      5,180,029

SOURCE: Calculated from Bureau of Adult, Vocational, and Technical
Education, Summary Data, Vocational Education (May 1973).

* Includes prevocational, prepostsecondary, and remedial programs.

Roby has described the recent increases in enrollments in vocational
education. As college enrollments have [illegible], vocational
education enrollments increased from roughly 7.5 to 11.6 million from
1968 to 1972. Adding to the influence of vocational education was the
Higher Education Act of 1972 authorization of $950 million for
post-secondary occupational (i.e., vocational) education. This gave an
impetus to institutions of higher education to further expand
vocational education.

However, the statistics for women in vocational education are less
than encouraging (32). As of 1972, there were 6.4 million women and
girls enrolled in public vocational programs in the country. Of these
girls and women, nearly half were being trained in homemaking and home
economics and another 28 percent in office practices. Baseline Report
Data ([illegible]) indicates that there have been changes in the
various areas of vocational education. However, the data continue to
show that very few women are being trained for the 20.1 million jobs
that the National Planning Association estimates will occur in what
have been viewed primarily as male occupations, including the
better-paying trades, industrial, and technical jobs for which high
schools now offer vocational courses with entry-level skills
preparation.
Similar sex-segregated distributions occur in both higher education
and labor force participation in general. Women work in all
occupational categories, but they are concentrated in fewer
occupational categories than men ([illegible]). Women constitute
[illegible] percent of the civilian noninstitutional population 16
years old and over, but more women are employed part time ([illegible]
percent), and women constitute only a third (31 percent) of all
persons employed in professional-technical and non-farm
managerial-administrative occupations. Women are predominant (68
percent) in the persons employed in clerical-sales occupations. The
greatest rate of increase of women in the labor force between 1960 and
1975 is for women with young children. The increases were from 15 to
33 percent among women with children under the age of 3, and from 25
to 42 percent among women with children between the ages of 3 and 5
(1). Equity in education and earnings are important to all working
women. Although women make up 42 percent of the work force, they
receive only 25 percent of the total earned wages going to American
workers (82). The average woman earns 60 percent of what the average
man earns, a smaller share than 20 years ago, when woman's paycheck
averaged 64 percent of the average man's (80).

In higher education the proportion of women in science is small and
the proportion drops at each higher level of degree, salary, academic
rank, and administrative responsibility (Vetter, 115). Among almost
207,500 science and engineering Ph.D.s in the U.S. labor force, 92
percent are male. The proportion of women enrolled and graduating in
these fields was higher in the 1920s than in any decade since, but now
appears to be rising. In the field of chemistry, for example,
according to Vetter, women have earned 19 percent of the bachelor's
degrees, 20.8 percent of the master's, and 7.3 percent of the
doctorates since 1960. Women earned 11.17 percent of the chemistry
doctorates in the period 1973-1976 (National Research Council, Women
and Minority Ph.D.s in the 1970s). However, at institutions awarding
the doctorate in 1973, only 2 percent of the chemistry faculty above
the level of instructor were women, and only [illegible] percent of
federally employed chemists at all degree levels were women.



There is a lack of data showing significant differences in talent that
would account for the discrepancy in education, vocational entry, and
career attainment. The most comprehensive review of sex differences in
the psychological literature has been carried out by Maccoby and
Jacklin (67). At the conclusion of their detailed review of the
psychological literature on sex differences, they could find few
consistent differences in performance. Even in the few differences
found, they were inclined to emphasize considerable overlap of the
distribution of male and female abilities or talents on whatever
psychological dimension was being measured. There are, for example,
many girls with high-level skills in mathematics, just as there are
many boys with high-level skills in the verbal area, two traditionally
stereotyped areas of men's and women's skills. There are more than
enough women to fill engineering schools if women's talents were
developed through the requisite early training and interests.
According to Maccoby and Jacklin:

Women are now considerably underrepresented in engineering in
terms of any criterion by which potential talent can be measured.
We have no wish to push women toward careers that do not attract
them. At the same time we believe it would be a grievous
injustice to establish formal or informal quotas that would
exclude any women with the requisite talents and interests. We
are discussing quotas that exclude women because, historically,
women have been excluded from training for high status careers
more frequently than men, but of course the argument applies in
both directions.

Applicants or students should be assessed and counseled on the basis
of their measurable talents, not on the basis of probabilities of
attainment based on sex.

There appear to be some documented sex differences in performance,
including the visual-spatial skills areas, mathematics aptitude and
achievement at later stages of education, verbal performance, and
perhaps some social behaviors such as aggressiveness. However, in the
research surveyed, there are problems in establishing comparability of
groups in experience and education levels, of possible bias in
measures, and of measuring particular abilities and social skills in a
wide variety of contexts. There is a great danger in stereotyping
males or females on the basis of mean differences in group performance
in view of the extensive overlapping among the groups on distributions
in all abilities measured.
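
The point about overlapping distributions can be illustrated
numerically. The sketch below assumes, purely for illustration, that
scores in each group are normally distributed with equal spread and
that the group means differ by d standard deviations; neither the
normality assumption nor the value of d is taken from the research
reviewed here.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def overlap_coefficient(d):
    """Shared area of two equal-variance normal curves whose means
    differ by d standard deviations (1.0 means identical curves)."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)


def share_above_other_mean(d):
    """Fraction of the lower-scoring group that still scores above
    the mean of the higher-scoring group."""
    return normal_cdf(-abs(d))


# Hypothetical mean difference of half a standard deviation.
d = 0.5
overlap = overlap_coefficient(d)
above = share_above_other_mean(d)
```

Under these assumptions, a mean difference of half a standard
deviation leaves roughly four-fifths of the two distributions in
common, and nearly a third of the lower-scoring group scores above
the other group's average. A mean difference of this kind therefore
says little about any individual.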

One area in which it appears there are differences in the variables
measured is career interests. There are differences among female and
male responses to existing item pools. However, the discussion in the
section on interest measurement points out that these data are limited
by being based on sets of items that have shown sex differences over
long periods of time, and currently may not be related to the
measurement of interests of specific occupational groups.



Policy makers need to be concerned that tests may function to
reinforce existing performance patterns. For example, if interest
measures do not provide the same sets of career options to males and
females, then they are likely to reinforce existing occupational
distributions in the labor force. Similarly, the results of aptitude
measures that students take with differential experience prior to the
test may also serve to reinforce existing stereotypes of skills of
females and males. The results of tests of educational achievement in
mathematics may be misleading for women and those who are using the
test scores, if the amount of math experience that females and males
have is not equivalent. These examples are examined in more detail in
the issues sections for each of the major types of educational tests.
SEX BIAS IN ACHIEVEMENT TESTS



OVERVIEW: CONTENT AND CONSTRUCT VALIDITY 

The simplest definition of validity is captured in the question: Are
we measuring what we think we are measuring? If teachers construct a
test to assess the achievement of their ninth grade students in
algebra, or if they want to select a standardized test to measure this
achievement, they are concerned with content validity. Typically,
content validity is assessed by examining the match between the
curriculum and the test content. For standardized achievement tests,
test publishers develop a set of specifications that define the domain
of content to be measured by the test. Typically, this content domain
specifies both the substance and the process (or task behavior) to
appear in the test. By substance, in arithmetic achievement for
example, the test specifications might be concerned with whether
students are answering questions on fractions or whole numbers. The
process dimension refers to whether they are asked to add, subtract,
multiply, or perform other types of manipulations on the substance or
content. Content validity, then, is concerned with the
representativeness or sampling adequacy of the test items for the
domain to be measured and is typically assessed by judgment. The
persons selecting a test will provide either formal or informal sets
of judgments as to the content validity of the test for their
particular group of students and their curricula.
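
The match between curriculum and test content described above can be
pictured as a grid of substance by process cells. The sketch below is
one informal way to tabulate such a comparison; the function name,
the cell labels, and the item classifications are hypothetical, and
the tabulation supplements rather than replaces the judgment on which
content validity rests.

```python
from collections import Counter


def blueprint_coverage(curriculum_cells, test_items):
    """Compare test items with the curriculum's content domain.

    Both arguments are lists of (substance, process) pairs. Returns
    the fraction of curriculum cells sampled by at least one item,
    the cells with no items, and item cells outside the curriculum.
    """
    domain = set(curriculum_cells)
    if not domain:
        raise ValueError("curriculum_cells must not be empty")
    counts = Counter(test_items)
    covered = {cell for cell in domain if counts[cell] > 0}
    missing = sorted(domain - covered)
    off_curriculum = sorted(set(test_items) - domain)
    return len(covered) / len(domain), missing, off_curriculum


# Hypothetical arithmetic curriculum and item classifications.
curriculum = [
    ("fractions", "add"),
    ("fractions", "multiply"),
    ("whole numbers", "add"),
    ("whole numbers", "subtract"),
]
items = [
    ("fractions", "add"),
    ("whole numbers", "add"),
    ("whole numbers", "add"),
    ("decimals", "add"),
]
fraction, missing, off_curriculum = blueprint_coverage(curriculum, items)
```

In this invented case the test samples only half of the curriculum
cells and includes an item on decimals that the curriculum does not
teach, the kind of mismatch a content validity review is meant to
surface.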

Strictly speaking, if a student does not answer questions on a test,
the resulting low score is a valid report in one sense. The danger in
drawing this conclusion immediately was described by Messick (75).
Suppose a deaf pupil has been given a spelling test by dictation.
Although the low score is a valid report that the pupil did not spell
from dictation, the inference that the pupil lacked the ability to
spell those words is much more tenuous. One can infer from correct
performance that the student possessed the requisite abilities.
However, to make the inference of inability or incompetence from the
absence of correct performance requires the elimination of a number of
plausible rival hypotheses.

Psychologists have used the term construct
validity to make more apparent the nature
of inferences that are drawn from test
scores. If a student has a low score on a
test, we want to know the meaning of the
test score. Does the absence of correct
performance mean a student has not had an
opportunity to learn the material? The
construct validity of achievement tests
is concerned with demonstrating that the
opportunity to learn variable is the main
reason that students may not have
achieved a desired level. Plausible
rival hypotheses for lack of a correct
performance need to be ruled out.
Status characteristics such as sex,
ethnicity, and socioeconomic status should
not be the major variables accounting for
individual differences in student
performance on achievement tests.

Another sense of bias is the presentation
of test material that is sex-role
stereotyped and not a fair representation of
women's status and achievements in our
culture and in history. Sex-fairness in
representation of women in occupational
roles and scientific achievements is the
first issue discussed below and is one
aspect of the "face validity" of an
achievement test. The effect of
sex-fairness of representation of women may
or may not be reflected in achievement
test performance. (Evidence bearing on
this issue is described later.) Whether
or not "biased" content is reflected in
performance is irrelevant to the larger
issue of fairness of representation.
Test content, to the extent that
illustrative materials can represent both
sexes fairly, should be representative
and face valid on a basic principle of
justice or fairness.
The efforts to examine item or test bias
from a statistical or empirical viewpoint
have an important role to play and are
another part of the procedures to
eliminate sex bias and provide a sex-fair
test. In the long run, however, because
there are still many culturally different
experiences for boys and girls, it will be
some time before all differences between
the sexes in performance on achievement
tests are eliminated. However, the item
bias studies can help to reduce the amount
of irrelevant differences. Empirical
study of items can reduce at least part of
the sex differences in performance that
are not attributable to direct
instructional experience. Some differences may
be attributable to earlier experiences.
For example, boys are more likely to have
experience with mechanics, electrical
repair, and so on. Items that embed
mathematical processes in these contexts
may be easier for boys than for girls
because of their greater familiarity with
the type of situation in the item. If
the same mathematical process can be
measured with a sex-neutral content, one
on which boys and girls on the average
perform similarly, this type of item is
not "sex biased" and is preferable if it
meets the test specifications for a
sex-fair and content valid test.

The discussion below and many of the
studies cited are based on standardized
(norm-referenced) tests. This is because
many of the studies to examine sex or
minority group bias are conducted in
large-scale assessments of pupil
achievement that use standardized tests.
However, the issues and policies below also
apply to criterion or objectives-referenced
achievement measures. The
examination of items for their face
validity, or fairness of representation,
and for statistical or empirical bias is
equally mandatory for both norm-referenced
achievement tests and for
objectives-referenced achievement tests.



ISSUE: OVERT SEX BIAS AND CONTENT
VALIDITY

An earlier review (112) of the portrayal
of women in educational tests included
this summary:

    Women are portrayed almost
    exclusively as homemakers or in the
    pursuit of hobbies (e.g., "Mrs.
    Jones, the President of the
    Garden Club..."). Young girls
    carry out "female chores" (e.g.,
    father helps Betty and Tom build a
    playhouse; when it is completed,
    "Betty sets out dishes on the
    table, while Tom carries in the
    chairs...").

    In numerous activity-centered
    items, boys are shown playing,
    climbing, camping, hiking, taking
    on roles of responsibility and
    leadership. Girls help with the
    cooking, buy ribbon and vegetables,
    and when participating in any
    active pursuit, take the back seat
    to the stronger, more qualified
    boys (e.g., Buddy says to Clara,
    "Oh, I guess it's all right for
    us boys to help girls. I've done
    some good turns for girls myself,
    because I'm a scout.").



Items in achievement tests have conveyed
the impression that the majority of
professions are closed to women. A
reading passage about the presidency of
the United States discussed the
president's characteristics and
qualifications and included the statement, "In the
United States, voters do not directly
choose the man they wish to be
president." Routinely, teachers are
described as females, while professors,
doctors, lawyers, and presidents of
companies are listed as male. These
statements apply to the achievement tests
as reviewed for overt bias in 1974. The
tests were not unknown tests, but were
the achievement tests published by the
major test publishing companies. And the
tests examined carried publication dates
from 1964 to 1971. The analysis of the
achievement tests showed the same results
as the analysis of other types of
educational materials that have been examined
for discrimination against women. (See,
for example, 59, 34, 93, 53, and 37.)



These studies of instructional materials,
as well as the study of educational
achievement tests, provided extensive
documentation on one type of overt
sexism, that of sex-role stereotyping in
content. A second way of examining overt
sexism is to look at language usage.
Another estimate of content bias may be
obtained by determining if males are
referred to more often than females.
Reading comprehension, social studies,
science, and other tests may well cite
male novelists more frequently than
female, describe male scientists'
activities more frequently, and so build,
or rather reinforce, the view of the
occupational world as male oriented.

There is some evidence that sex bias in
content may largely arise through content
selection. The preparation of the new
American Heritage Dictionary involved the
computer analysis of 10,000 500-word
samples from 1,000 of the most frequently
used instructional publications in
representative schools across the country (68).
The analysis of these school materials
showed evidence of male orientation. The
word boy or boys appears 5^,700 times
versus 2,200 for girl or girls; of the
20 given names most frequently used, 13
were male and 7 were female. These are
specific instances of male orientation in
school materials. Several writers have
discussed the general male orientation of
the English language, in what appears to
be sex-typed use of language (103, 59,
108).

An estimate of the weighting of content
toward males or females can be obtained
by computing ratios of frequency of usage
of male nouns and pronouns to female
nouns and pronouns. One factor that
apparently contributes to bias is that
the English language has no singular
pronouns equivalent to the plurals they,
their, and them. Common usage attributes
"maleness" to most occupations, for
example, the carpenter..."he," the
counselor..."he," and the writer (author)
..."he."

A study by Tittle, McCarthy, and Steckler
(112) examined whether overt bias arose
from content selection or whether it
could be attributed to the common use of
language and the generic nouns and their
pronoun referents (i.e., references to
such nouns as mankind, chairman,
fisherman). In order to separate the content
selection and usage factors, two ratios
were computed: 1) the ratio of the
count of the frequencies of male nouns
and pronouns to female nouns and
pronouns, using only regular nouns and
pronouns; and 2) the ratio of counts of
all male nouns and pronouns to all female
nouns and pronouns including generic nouns
and pronouns, a count labeled all nouns
and pronouns.
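
The two ratios just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the word lists
below are short, hypothetical stand-ins for the study's coding rules,
the function name usage_ratios is invented here, and pronouns that
refer back to a generic noun (which the study assigned by hand) are
not separated from regular pronouns.

```python
from collections import Counter

# Hypothetical, deliberately short word lists (assumptions for illustration;
# the study's actual coding lists are not reproduced in the text).
MALE_REGULAR = {"he", "him", "his", "boy", "boys", "man", "men", "father"}
FEMALE_REGULAR = {"she", "her", "hers", "girl", "girls", "woman", "women", "mother"}
MALE_GENERIC = {"mankind", "chairman", "fisherman"}  # generic nouns named in the text


def usage_ratios(text):
    """Return (regular ratio, all ratio) of male to female references.

    regular ratio: male regular nouns/pronouns divided by female ones.
    all ratio: the same count with generic male nouns added to the male side.
    Returns (None, None) when the text has no female references.
    """
    words = Counter(w.strip('.,;:!?"()').lower() for w in text.split())
    male_regular = sum(words[w] for w in MALE_REGULAR)
    female = sum(words[w] for w in FEMALE_REGULAR)
    male_all = male_regular + sum(words[w] for w in MALE_GENERIC)
    if female == 0:
        return None, None
    return male_regular / female, male_all / female


sample = "The chairman said he would help the girl. She thanked him."
regular_ratio, all_ratio = usage_ratios(sample)
print(regular_ratio, all_ratio)
```

If the two ratios come out close to each other for a test, generic
usage adds little to the male count, which is the comparison the study
relied on.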



These ratios provide the basis for one
examination of sex bias in language
usage in achievement tests. Eight of the
most frequently administered achievement
test batteries were analyzed and the
major findings were:

(1) There were few differences
    between the conclusions drawn
    based on the ratios using
    all nouns and pronouns and
    those using regular nouns and
    pronouns only (i.e., excluding
    generic noun and pronoun
    referents). Content bias in
    favor of males did not appear
    to be primarily a function of
    word choice, but rather of
    content selection.

(2) With one exception, each test
    battery showed a higher
    frequency of usage of male nouns
    and pronouns than of female
    nouns and pronouns. The
    range of the ratios was from
    .86 (slightly more female
    references than male) to a high
    of 14 (14 times more use of
    male nouns and pronouns than
    female).

(3) The number of subtests in each
    achievement test series with
    ratios below 1.00, where more
    female than male nouns and
    pronouns were used, ranged from
    none (out of 7 subtests) to 5
    subtests (out of 9). The
    majority of achievement test
    batteries show few subtests
    with ratios at or below 1.00,
    another indication of the
    imbalance in the references to
    females and males in the
    achievement tests analyzed.

(4) The extent of the imbalance
    was shown in the five subtests
    with the highest ratios of
    frequencies of male nouns and
    pronouns to female: 84:1;
    84:1; 69:1; 41:1; and 33:1.
    These figures indicate that
    in two subtests 84 male nouns
    and pronouns were counted and
    only one female noun or pronoun.

These findings were also confirmed in a
study carried out by Donlon, Ekstrom,
Lockheed, and Harris (22). Thus the use
of language and sex-role stereotypes
confirms the under-representation of women
generally and an over-representation of
women in traditional settings. It is
possible to improve the counts and
findings, as shown in the development of the
1978 edition of the Metropolitan
Achievement Tests (MAT). Jensen and Beck (54)
reported on a gender balance analysis of
the new MAT. The publisher balanced the
presentation of females and males and
portrayed both sexes in less stereotypic
roles. There was a marked difference
from the 1970 edition to the 1978 edition
of the MAT. Although some tests retained
an imbalance, the overall balance was
improved, changing from a median ratio of
male nouns and pronouns to female nouns
and pronouns of 2.95 to a ratio of 1.1.
The study of the MAT also examined
traditional and nontraditional views of women
in four categories: occupations,
activities, roles, and emotions. One
interesting finding of this analysis was that
there was a greater tendency for females
to be better represented in both
traditional and nontraditional categories
of occupations and activities than for
males.

Of some interest in relation to "sex bias"
in language in achievement tests is the
relationship between sex bias in language
and performance on an achievement test.
As noted earlier, there are good reasons
to change the overt sexism in language in
achievement tests as the MAT has done,
regardless of whether or not there is a
relationship to test performance. On the
other hand, the topic is of some interest
and there have been studies of the
relationships. Plake, Hoover, and Loyd (84)
looked at the differential item
performance by sex on the Iowa Tests of Basic
Skills (ITBS) for grades 3 through 8.
Although some items were found on which
there were different performances by males
and females (statistically significant),
the differences were not practically
significant. Plake indicated that her
results provided little support for the
idea that test content as it relates to
sex-role stereotyping or the frequency
of sex-identified nouns and pronouns
affected performance on mathematics or
other test items.

Donlon et al. (22) did find some evidence
of differential performance by females and
males for items where the test content
reflected more male nouns and pronouns
than female nouns and pronouns. The
measure that the Donlon et al. study used
was sex difference in passing the item
(male percent passing minus female
percent passing) correlated with total male
references and total female references.
This measure showed few differences
except for the STEP reading test at
grade 10. Here there were moderate
correlations between sex differences in
passing and total male references and
total female references. More generally
stated, there was a very moderate but
significant tendency for females to do
better on items that contained more
female references. Other findings of
interest in the Donlon study were that
females and males were highly similar on
such factors as their rate of work in
completing the test and the number of
items omitted.
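
The measure described above (male percent passing minus female percent
passing, correlated with counts of male and female references) can be
sketched as follows. The item data are invented purely for
illustration and are not taken from the Donlon et al. study; the
function names are likewise assumptions, and a plain Pearson
correlation is assumed because the text does not name the coefficient
used.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented example items (NOT data from the study): percent passing by sex
# and the number of male and female references in each item.
items = [
    {"male_pass": 62.0, "female_pass": 55.0, "male_refs": 5, "female_refs": 0},
    {"male_pass": 48.0, "female_pass": 50.0, "male_refs": 1, "female_refs": 2},
    {"male_pass": 70.0, "female_pass": 71.0, "male_refs": 0, "female_refs": 3},
    {"male_pass": 55.0, "female_pass": 51.0, "male_refs": 3, "female_refs": 1},
]

# Sex difference in passing: male percent passing minus female percent passing.
sex_diff = [item["male_pass"] - item["female_pass"] for item in items]

r_male = pearson(sex_diff, [item["male_refs"] for item in items])
r_female = pearson(sex_diff, [item["female_refs"] for item in items])
print(round(r_male, 2), round(r_female, 2))
```

A positive correlation with male references, or a negative one with
female references, would point in the direction the study reported for
the grade 10 reading test; with real data the coefficients would be
expected to be far smaller than in this contrived example.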

In addition to the studies by Plake and
Donlon et al., three other studies have
been reported that examine sex
differences in performance on tests of
mathematics as a function of the item
context. Two studies used pupils at the
elementary school level and looked at sex
differences in performance for items in
which the activities were familiar or
stereotypically for men or women. King
and Blount (60) used a teacher-developed
test and reported differences in
performance of sixth grade students in the
direction the opposite of prediction.
That is, the girls did better on items
with a masculine orientation. f^wder and
Chase (78) reported no sex difference in
performance of third grade pupils when
responding to sex-typed items. Sixth
grade data showed an interaction effect,
with girls doing better than boys on the
"male-bias" test. McCarthy (71),
however, found more systematic differences
in performance when using a sample of
high school students in grades 10 to 12.



Items were analyzed for the following tests: California Achievement Test V^^^^ ^* Form
A), Iowa Tests of Basic Skills (Levels 11 and U', Form 6), Metropolitan Achievement Test
(Grade 12, Form F), and The Sequential Tests of Educational Progress (Grade 10, Series II).



Her results are more in line with the
general trend in the research literature
toward no differences in math performance
between the sexes in the early levels and
some findings of sex differences at the
later, high school grade levels.

McCarthy constructed an item pool, varying
the item context and holding the
mathematical process the same. She had a
subsample of students rate each item on
a scale of 1 to 5, 1 being of great
familiarity to males, 5 being of great
familiarity to females. Items selected
for the final test were definitively rated
male, female, or neutral. She used the
relationship between passing the item and
performing well on the test to select the
26-item "best" tests for the Total group,
for Males only, and for Females only.
With this approach, the items selected for
the Total group included 10 male items,
4 female items, and 12 neutral items. The
items selected for the Male group included
11 male items, 6 female items, and 9
neutral items. The items selected for
the Female group included 2 male items,
14 female items, and 10 neutral items.
The item overlap was: the Total group
and Male group had 20 of 26 items the
same (77 percent overlap); the Total group
and Female group had 15 of 26 items the
same (58 percent overlap); and the Male
group and Female group had 13 of 26 items
the same (50 percent overlap). An
analysis that controlled for attitude toward
mathematics and verbal ability did not
reduce the significance of the differences
found: males scored higher than females
on the Total group test and Male test;
females scored significantly higher on
the Female group test.
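
The overlap percentages quoted above follow directly from the counts
of shared items on the 26-item tests. The short sketch below redoes
that arithmetic and checks that each test's male, female, and neutral
items add up to 26; the variable and function names are chosen here
for illustration only.

```python
TEST_LENGTH = 26  # each "best" test has 26 items

# (male items, female items, neutral items) as reported for each test
composition = {
    "Total": (10, 4, 12),
    "Male": (11, 6, 9),
    "Female": (2, 14, 10),
}

# Number of items two tests have in common, as reported
shared_items = {
    ("Total", "Male"): 20,
    ("Total", "Female"): 15,
    ("Male", "Female"): 13,
}


def overlap_percent(shared, test_length=TEST_LENGTH):
    """Shared items as a whole-number percentage of the test length."""
    return round(100 * shared / test_length)


# Every composition should account for all 26 items.
for name, counts in composition.items():
    assert sum(counts) == TEST_LENGTH, name

overlap = {pair: overlap_percent(n) for pair, n in shared_items.items()}
for pair, pct in overlap.items():
    print(pair, pct)
```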

McCarthy's findings are consistent with
earlier studies indicating that by the
high school or college level, women's
performance on tests of mathematics involving
problem solving may be affected by their
familiarity with (or attitude toward) the
context in which the problem is posed.
Milton (76) reported a series of five
studies using high school and college
students in which he showed that
problem-solving differences between the sexes
were reduced (but not eliminated) when
the problems were written in content
appropriate to the feminine role.

A related study was conducted with an
"aptitude" test. Strassberg-Rosenberg and
Donlon (104) examined sex differences in
performance on mathematics items on the
Scholastic Aptitude Test and found that
more items were "biased" (had higher
percent passing) on the average in favor
of males on the regular math items and
data sufficiency geometry items. Items
biased in the female direction were
algebra regular math items and five
miscellaneous regular math items. These
findings were similar to Donlon's (21)
earlier study of SAT items, in which he
found items with algebra content to be
easier, on the average, for women than
the other items. His earlier study also
found a definite masculine tenor to the
content of 17 items with real-world
referents.

The summary of what is known about overt
sex bias in achievement tests can be
stated as follows: studies have
documented the bias in favor of males in
selection of item content and that there
are sex-role stereotyped views of women
and men. There is some evidence that
familiarity with item context may affect
performance in mathematics tests at the
high school age and also that more
frequent references to women may affect
performance at the high school level. Both
of these latter findings are based on few
studies and need to be replicated.
However, there are clear implications for
policy from the findings that are well
documented.



Policy 

Test publishers are taking steps to
eliminate overt sexism from tests and the
accompanying test interpretation
materials. However, the policy decisions at
schools and other educational groups
selecting achievement tests should focus
on developing standards in several areas
for use in test reviews. This section
lists several of the types of categories
and procedures that have been used by
publishers or in studies of tests and
instructional materials to examine
sex-role stereotyping and sex bias in
language usage. These can be adapted for
school use.

A readily used form was developed by
Carol Jacobs and Cynthia Eaton (53). The
form was originally developed to evaluate
sexism in readers for the elementary
school and is illustrated in Figure 1.
Figure 1

Tally Form for Reviewing Educational Materials
(Jacobs and Eaton, 53)*

EVALUATING SEXISM IN READERS

                                              MALE    FEMALE

1. Number of stories where main character is:

2. Number of illustrations of:

3. Number of times children are shown:
   (a) in active play
   (b) using initiative
   (c) displaying independence
   (d) solving problems
   (e) earning money
   (f) receiving recognition
   (g) being inventive
   (h) involved in sports
   (i) fearful or helpless
   (j) receiving help

4. Number of times adults are shown:
   (a) in different occupations
   (b) playing with children
   (c) taking children on outings
   (d) teaching skills
   (e) giving tenderness
   (f) scolding children
   (g) biographically

5. In addition, ask yourself these questions: Are boys allowed to show their
   emotions? Are girls rewarded for intelligence rather than for beauty? Are there
   any derogatory comments directed at girls in general? Is mother shown working
   outside the home? If so, in what kind of job? Are there any stories about
   one-parent families? Families without children? Are baby-sitters shown? Are
   minority and ethnic groups treated naturally?

*Reprinted by permission of Today's Education, The Journal of the National
Education Association.



Table 3

Categories Used by Jacklin to Examine Sex Bias
(Saario, Jacklin, and Tittle, 93*)

1. Main and secondary characters

2. Type of environment:
     Home
     Outdoors
     Place of business
     School

3. Behavior exhibited:
     Nurturant (helping, praising, serving)
     Aggressive (hitting, kicking, verbal put-downs)
     Self-care (dressing, washing)
     Routine-repetitive (eating, going to school)
     Constructive-productive (building, writing story, planning party)
     Physically exertive (sports, lifting heavy objects)
     Social-recreational (visiting someone, card games)
     Fantasy activity (doll play, cowboys and Indians)
     Directive (initiating, directing, demonstrating)
     Avoidance (stop trying, run away, avert eyes)
     Statement about self: positive, negative, neutral ("I have blue eyes."
       "I'm too stupid.")
     Problem-solving (producing idea, unusual combinations)
     Statements of information ("I know..."; non-evaluative observations about
       other people)
     Expression of emotion (crying, laughing)
     Conformity (express concern for rules, social norms, others' expectations,
       do as told)
     General verbal (trivial motor behavior such as dropping something, looking
       for something, listening)

4. Types of consequences:
     Positive consequences:
       From others: directed toward subject (praise, recognition, support, signs
         of affection)
       From self: self-praise, satisfaction
       From situation: reaching goal, unintended positive results
       Chance
       Author's statement, text
     Negative consequences:
       From others: directed toward subject (criticism, correction, rejection
         of ideas)
       From self
       From situation: inability to reach goal, unintended negative results
       Chance
       Author's statement, text
     Neutral consequences: not clearly positive or negative

*Saario, T.N., Jacklin, C.N., and Tittle, C.K. Sex Role Stereotyping in the
Public Schools. Harvard Educational Review, 1973, Vol. 43, p. 392-3.
© 1973 President and Fellows of Harvard College.
The "units" (items or reading passages)
of analysis can vary and illustrations are
included for analysis. (It might be
useful, particularly for the kindergarten and
early reading tests, to analyze pictures
on a separate tally form from the texts
of items, since pictures are frequently
used in the early tests.)

A more extensive category system was used
by Carolyn Jacklin in a study of early
readers. Each character in each story
was classified by age and sex and was
coded on five additional categories:
a. occurrence as a main character;
b. occurrence in specific environment;
c. occurrence as exhibiting specific
behavior; d. occurrence as bearers of
specific consequences; and e. occurrence
as recipients of specific behaviors and
consequences. The categories used by
Jacklin are listed in Table 3.



Jacklin found satisfactory reliability,
that is, consistency in agreement among
raters in classifying characters in
stories according to these categories.
This elaborate classification scheme
permitted Jacklin to construct several
highly informative tables to summarize
the data. Tables 4 and 5 are two
illustrations of the summary tables
that are possible.




Table 4

Total Number of Characters in the Sampled
Stories Displayed by Person-type*

(Saario, Jacklin, and Tittle, 93)

            Female    Male    Total

Child         241      324     565
Adult         124      256     380
Total         365      580     945



Table 5

Number of Main Characters by Age and Sex*

(Saario, Jacklin, and Tittle, 93)

                            Female    Male

Adults
  Number main characters        7       33
  Total no. in stories        124      256

      Chi square = 3.95;
      df = 1, p < .05

Children
  Number main characters       61      110
  Total no. in stories        241      324

      Chi square = 3I^Si
      df = 1, p < .05
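
A chi-square test of this kind can be sketched for the adult rows of
Table 5, treating the counts as a 2 x 2 table (main character or not,
by sex). The text does not say how the published statistic was
computed, so the continuity correction used below is an assumption,
and the function name is invented for illustration. With the
correction the adult counts give a value of about 3.9, close to the
printed figure and above 3.84, the .05 critical value for one degree
of freedom.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic (df = 1) for the 2 x 2 table [[a, b], [c, d]].

    With yates=True the Yates continuity correction is applied; whether the
    original study used it is not stated, so this is an assumption.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Adults in Table 5: 7 of 124 female and 33 of 256 male characters are
# main characters; the remainder are characters who are not main characters.
female_main, female_total = 7, 124
male_main, male_total = 33, 256

corrected = chi_square_2x2(female_main, female_total - female_main,
                           male_main, male_total - male_main)
uncorrected = chi_square_2x2(female_main, female_total - female_main,
                             male_main, male_total - male_main, yates=False)
print(round(corrected, 2), round(uncorrected, 2))
```

A school tallying its own materials could substitute its own four
counts; small expected counts would call for a different test.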



Another form of summarizing the data was
used for the environment category (in
Table 6) and the same presentation was
also used for types of behaviors
performed by children and for types of
consequences for children and adults.



*Saario, T.N., Jacklin, C.N., and Tittle,
C.K. Sex Role Stereotyping in the Public
Schools. Harvard Educational Review, 1973,
Vol. 43, p. 394. © 1973 by the President
and Fellows of Harvard College.
Table 6

Types of Environments in Which Children (C) and Adults (A) of Each Sex (M/F) are
Shown, Given in Frequencies and in Percentages of Total Environments Shown by Each
Age and Sex+

                        Frequencies and Percentages

Environment          CF           CM           AF           AM
                   n=241        n=324        n=124        n=256

Home                 97           --           83***        59
                     --          29.0%        54.6%        23.6%

Outdoors            157           --           47          144
                    55.2%        61.1%        30.9%        57.6%

Business             15           --            8**         40
                     5.3%         4.2%         5.3%        16.0%

School               15           22           14            7
                     5.3%         5.7%         9.2%         2.8%

Totals: Frequency    --           --          152          250
        Percentage  100%         100%         100%         100%

  * = p<.05
 ** = p<.01
*** = p<.001

+Saario, T.N., Jacklin, C.N., and Tittle, C.K. Sex Role Stereotyping in the Public
Schools. Harvard Educational Review, 1973, Volume 43, page 396. © 1973 by the President
and Fellows of Harvard College.



The use of a category system as extensive
as the one developed by Jacklin has
several advantages. The system can be used
with curriculum materials as well as
with tests and is detailed enough to
permit discussion between teachers and
students about sex bias and
instructional materials, if the school
desires.

The category systems used to analyze the
content of educational tests have been
less well defined but are also useful.
The Donlon et al. study (22) classified
the various roles attributed to females
and males in the test content and also
the relative status of male and female
roles. Words that showed vocations,
avocations, or special functions of
people (for example, doctor, mother)
were coded as roles. Roles were not
inferred from the descriptions of
individual behavior. For example, the
role of "house husband" was not
inferred from the sentence, "he cleaned
the house and fixed dinner." The
identification of particular roles as
female, male, or neutral was decided
by the percentages of females and males
found actively engaged in that role as
documented by the Occupational
Characteristics, 1970 Census of Population.
When the Census showed 80 percent or
more of the individuals engaged in an
occupation were one sex, the occupation
was defined as a sex-typed role. Other
occupations were classified as neutral
roles. Historical consideration of
roles was handled by general knowledge.
If an item involved a role that was
generally known as restricted to one sex
(e.g., knights, the congressmen of 1800)
it was coded as a sex-typed role. The
assessment of relative status required
both females and males to be present
in an item. (The description of this
coding system is available in Donlon
et al., 22.)
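
The 80 percent rule described above amounts to a simple threshold
test. The sketch below assumes the input is the percentage of women
among people engaged in an occupation; the function name and the
example percentages are hypothetical and are not Census figures.

```python
def classify_role(percent_female, threshold=80.0):
    """Classify an occupation as a female, male, or neutral role.

    percent_female: share of women among people in the occupation (0-100).
    An occupation is sex-typed when 80 percent or more are one sex.
    """
    if not 0.0 <= percent_female <= 100.0:
        raise ValueError("percent_female must be between 0 and 100")
    if percent_female >= threshold:
        return "female"
    if 100.0 - percent_female >= threshold:
        return "male"
    return "neutral"


# Hypothetical percentages for illustration only (not Census data).
examples = {"occupation A": 97.0, "occupation B": 5.0, "occupation C": 60.0}
for name, pct in examples.items():
    print(name, classify_role(pct))
```

Roles settled by general historical knowledge, such as knights, fall
outside this rule and would still need to be coded by hand.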

The coding system used for the study of
the Metropolitan Achievement Test (MAT)
content included an analysis of female
and male illustrations as well. The
analysis of illustrations categorized
children and adults separately,
according to three categories: male taller
than female; female taller than male;
female same as male. The classification
of illustrations was in response to a
criticism that boys are often shown as
taller than girls, when in fact children
of the same age are very similar in
height regardless of sex. Also, the
illustrations were examined for the
portrayal of boys and girls and women and
men in mixed-sex activities. Group
activities was another category, i.e.,
single sex versus mixed sex groups, that
may be useful for content analysis.

Four categories used in the
"gender-balance" analysis of MAT written
materials are presented in Table 7.

TABLE 7

Categories for Analysis of Gender-Balance (Jensen & Beck, 1978)

1. OCCUPATIONS
   Female Traditional: Nurse, Teacher, Librarian, Secretary, etc.
   Male Traditional: Laborer, Professional, Principal, Boss, etc.
   Female Non-traditional: Professional, Laborer, Boss, Principal, etc.
   Male Non-traditional: Nurse, Teacher, Secretary, etc.

2. ACTIVITIES
   Female Traditional: School, Playing with Dolls, Onlookers, Domestic Chores
   Male Traditional: School Sports, Games, other physical activities, adventurer, etc.
   Female Non-traditional: Sports, games, physical activity
   Male Non-traditional: Domestic Chores, Child rearing, etc.

3. ROLES
   Active: Main character, problem solver, giving help/gift
   Passive: Secondary character, needing help, recipient of help/gift

4. EMOTIONS
   Female Traditional: Fear, nurturance/tenderness, dependency, etc.
   Male Traditional: Aggression, courage, emotional strength, "strong silent type"
   Female, Male Non-traditional: Cross-sex stereotypes

The other type of analysis is that of
the frequency of usage of male and
female nouns and pronouns in the test
content. As noted earlier, the counts
and ratios of frequencies of nouns and
pronouns that are sex-linked have been
used as another way of defining sex
bias in educational achievement tests.
Two types of systems have been used here
and either seems appropriate for use by
schools. In one type of analysis only
the male and female nouns and pronouns
are counted and categorized. In another
approach to this count of language usage,
the nouns and referent pronouns are
counted for males and females and, in
the case where sex cannot be assigned,
a neutral category is tallied. A tally
sheet would then have four categories:
male nouns and pronouns; female nouns
and pronouns; neutral nouns and
pronouns; and neutral noun and pronoun
referents. Donlon et al. (22) listed four
identification procedures for tallying:

(1) The noun is inherently sex-linked,
    e.g., mother, father, sister,
    brother.

(2) The noun is found to have a
    sex-specific definition in the
    dictionary, e.g., ballerina: 1. a
    principal female dancer in a ballet
    company, 2. any female ballet dancer.
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(3) The f>oun is a def I riJte female or male , 
/ hamc, e.g., Bill, Mary. ^' 

i^) The noun has a female or male pro- 
noun that refers to it, e.g., Pat 
; went to her class. 

In orti^r to det€>mlne the relative balance 
of male to female rcfcren9.es, the^ numbc^r 
9f actors or nOUns plus other woras such 
' as pronouns that refer toi^ them In the I tern 
are counted. Repet i t ions i|re also count- 
ed, V ^ 
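
The tallying procedure can be illustrated with a short sketch. It is only an illustration under stated assumptions: the word lists are hypothetical and very small, and the function name tally_referents is not taken from Donlon et al. A reviewer would substitute a dictionary and a list of names agreed on locally. The sketch counts repetitions, as the procedure requires, but it does not resolve a sex-neutral name such as Pat to the pronoun that refers to it (procedure 4); that step still needs a human reader.

```python
import re

# Hypothetical word lists for illustration only; not from the source.
MALE_WORDS = {"father", "brother", "he", "him", "his", "bill"}
FEMALE_WORDS = {"mother", "sister", "ballerina", "she", "her", "hers", "mary"}


def tally_referents(item_text):
    """Count male and female nouns and pronouns in one test item.

    Every occurrence is counted, so repetitions add to the tally.
    """
    counts = {"male": 0, "female": 0}
    for word in re.findall(r"[a-z']+", item_text.lower()):
        if word in MALE_WORDS:
            counts["male"] += 1
        elif word in FEMALE_WORDS:
            counts["female"] += 1
    return counts


if __name__ == "__main__":
    item = "Mary gave her brother a book. He thanked her."
    print(tally_referents(item))
```

Summing these per-item tallies over a subtest gives the counts from which male-to-female ratios are formed.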

Table 8 shows the summary table for the analysis of eight educational
achievement tests reported in 1974 (Tittle, McCarthy, and Steckler,
112). This table has an extra column in it, since this study
distinguished between generic nouns and pronouns (included in the
category called All) and the category called Regular, which excluded
the generic nouns and their pronoun referents. Table 8 lists a series
of achievement tests and summary ratios of male nouns and pronouns to
female nouns and pronouns; for example, the ratio in the California
Achievement Tests, Level 3, grades 4 to 6, was 4:1, that is, four
times as many male nouns and pronouns were used as female nouns and
pronouns. The analysis by a school of an individual test would show
the ratios for each individual subtest in the battery as well as the
total set of items in the test.
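
The ratio itself is a simple division of the male count by the female count. The following sketch shows the arithmetic, using the three California Achievement Tests counts reported for the 1974 study; the function name male_female_ratio is an assumption made for illustration.

```python
def male_female_ratio(male_count, female_count):
    """Return the ratio of male to female referents, to two decimals."""
    if female_count == 0:
        raise ValueError("ratio is undefined when there are no female referents")
    return round(male_count / female_count, 2)


if __name__ == "__main__":
    # Male/female counts for the California Achievement Tests rows.
    california = {"Level 3": (190, 47), "Level 4": (84, 46), "Level 5": (93, 36)}
    for level, (male, female) in california.items():
        print(level, male_female_ratio(male, female))
```

A school would apply the same calculation to each subtest and to the total set of items.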



This policy section has presented illustrations of the types of
content and language usage analysis that teachers, counselors, and
administrators may use to examine educational achievement tests for
overt sex bias. There is another aspect to the analysis of content
that should be mentioned. Earlier, a definition of content validity
was given in terms of the match of the test to the curriculum.
Content validity is a particular concern if the school is using new
curriculum materials related to women's studies in history, social
studies, and literature, for example. If a high school has included a
section on the history of women in social studies or history courses,
then this should become a separate category for analysis in matching
the test and local curriculum. Tests may not accurately reflect the
changing content of courses in schools, and this means paying special
attention in reviewing the content of tests for their appropriateness
in assessing local curricula.

If test content does not accurately reflect the local curriculum,
then test scores of students will reflect the mismatch. An example of
the results of this type of effect was demonstrated in a study by
Medley and Quirk (72) for the National Teacher Examination. In this
study the effect of changing content specifications to include
minority group history and cultural achievements was reflected in the
relative performance of blacks and whites on the National Teacher
Examination. Thus, the original definition of content validity is
important as local curricula begin to reflect changing perspectives
on women and their cultural and historical achievements.

The next issue is sex differences in performance on items. The
section is particularly concerned with ways in which statistical
evidence of item bias is developed.



ISSUE: ITEM BIAS AND CONSTRUCT VALIDITY

The issue with which statistical evidence of item bias is concerned
is whether or not there are sex differences in performance on test
items that do not appear to be related to the content and construct
validity of the item. The main approach to providing statistical
evidence is based on the percent passing the item for different
groups, to determine whether there are some items on which males do
better than females and, conversely, females do better than males.
Plake, Hoover, and Loyd (84) provided an example of a test item in
mathematics where females and males performed differently.



On the outside of the garage Mr. Nelson put a basketball goal 10 feet
above the driveway. The goal was 2/3 the height of the garage. How
many feet high was the garage?

1. ...
2. 13 1/3
3. 15
4. (Not given)
TABLE 8

Ratios of Male Noun and Pronoun Referents to Female Noun and Pronoun
Referents: Educational Achievement Test Batteries (Tittle et al., 1974)

                                               Total   All                Regular
Test                                           Items   M/F        Ratio   M/F        Ratio

California Achievement Tests
  Level 3 Form A (Gr. 4-6)                     343     190/47     4.04    190/47     4.04
  Level 4 Form A (Gr. ...)                     349     84/46      1.83    84/46      1.83
  Level 5 Form A (Gr. 9-12)                    ...     93/36      2.58    93/36      2.58

Comparative Guidance and Placement Program
  Form TPG (Gr. 13-14)                         391     127/9      14.11   106/9      11.77
  Form UPGX3 & UPGX4 (Gr. 13-14)               275     ...        ...     111/33     3.36

Iowa Tests of Basic Skills
  Form 6 (Gr. 3-8)                             1232    1121/368   3.31    1211/368   3.29

The Iowa Tests of Educational Development
  Form Y5 (Gr. 9-12)                           330     ...        ...     219/195    1.12

Metropolitan Achievement Tests
  Primary I Form F (Gr. 1.5-2.4)               174     51/59      .86     48/54      .89
  Primary II Form F (Gr. 2.5-3.4)              257     137/86     1.59    137/86     1.59
  Elementary Form F (Gr. 3.5-4.9)              ...     ...        ...     121/42     2.88
  Intermediate Form (Gr. 5.0-6.9)              534     181/44     4.11    178/44     4.05
  Advanced Form F (Gr. 7.0-9.5)                524     198/51     3.88    195/51     3.82

Sequential Tests of Educational Progress
  Series II Form 4A (Gr. ...)                  420     366/103    3.55    322/98     3.29
  Series II Form 3A (Gr. 6-9)                  420     443/150    2.95    408/149    2.74
  Series II Form 2A (Gr. 9-12)                 470     468/134    3.49    360/120    3.00
  Series II Form 1A (Gr. 13-14)                320     448/32     14.00   390/32     12.19

SRA Achievement Series
  Level 1-2 Form D (Gr. 1-2)                   320     179/88     2.03    179/88     2.03
  Level 2-4 Form D (Gr. 2-4)                   276     333/241    1.38    330/234    1.41
  Multilevel Form D (Gr. 4-9)                  1070    1513/231   6.55    1462/229   6.38

Stanford Early School Achievement Test
  Level I (Gr. K-1)                            126     217/93     2.33    217/93     2.33
  Level II (Gr. 1)                             259     192/168    1.14    190/168    1.13

Stanford Achievement Test
  Primary I Form W (Gr. 1.5-2)                 251     134/53     2.52    123/51     2.41
  Primary I Form X (Gr. 1.5-2)                 251     119/78     1.53    115/78     1.47
  Primary II Form W (Gr. 2-3)                  409     209/89     2.34    192/87     2.20
  Primary II Form X (Gr. 2-3)                  409     143/87     1.64    143/87     1.64
  Intermediate I Form W (Gr. 4-5)              540     221/83     2.66    198/71     2.78
  Intermediate II Form W (Gr. 5-6)             544     171/58     2.95    166/58     2.96
  Advanced Form W (Gr. 7-9)                    532     181/46     3.93    157/46     3.41
  High School Basic Battery Form X (Gr. 9-12)  ...     245/50     6.13    242/39     6.21
The percent of males passing the item successfully was 55.4; the
percent of females successfully passing the item was 37. One can
speculate as to why an item such as this may show a difference in
performance for girls and boys, but often there is no apparent reason
for observed differences. There have been few studies that asked
students to talk out loud during problem solving to identify what
might be different approaches to a problem such as the one above. The
item shown next was one in which the sex differences in performance
were reversed, with more females answering correctly than males.



In the 1968 Olympic games, the winner of the men's 200 meter dash was
timed in 19.8 seconds. The winner of the women's 200 meter dash was
timed in 22.5 seconds. The men's champion ran the 200 meter dash how
many seconds faster?

1. 2.7
2. 3.8
3. 3.7
4. (Not given)



The percent of females answering the question correctly was 67.5 and
the comparable percent for males was 55.8. From these examples it is
clear that selection of items with different percents passing can
affect the average scores of boys and girls on the total test score,
since the total test score reflects the sum of the percent passing
the individual items.
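
The link between item percents passing and the average total score can be checked directly. The sketch below uses a small, hypothetical set of right/wrong responses (1 = correct, 0 = incorrect); neither the data nor the function names come from the studies cited. It shows that a group's mean total score equals the sum of the proportions of that group passing each item.

```python
def item_proportions(responses):
    """Proportion of examinees passing each item (one column per item)."""
    n_examinees = len(responses)
    return [sum(column) / n_examinees for column in zip(*responses)]


def mean_total_score(responses):
    """Average number of items answered correctly per examinee."""
    return sum(sum(row) for row in responses) / len(responses)


if __name__ == "__main__":
    # Hypothetical responses of four examinees to a three-item test.
    group = [
        [1, 0, 1],
        [1, 1, 0],
        [0, 0, 1],
        [1, 0, 1],
    ]
    print(item_proportions(group))
    print(sum(item_proportions(group)), mean_total_score(group))
```

Because of this identity, replacing an item that favors one sex with an item that favors neither changes that group's expected total score by the difference in the two items' proportions passing.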

The percents passing the items, as illustrated above, are the data on
which a number of different statistical approaches are based. The
different approaches or methods that have been suggested for
examining items by group (male-female) differences try to identify
items on which there are statistically significant differences. The
variety of techniques used all provide evidence on the construct
validity of the items and test in a particular sense: reducing the
relationship between the status variable of sex and item or test
performance. The procedures all involve computing item difficulty
data separately for males and females, and then using
transformations, analysis of variance, chi square, or likelihood
estimators to identify the "outliers," or items for which the two
sexes are performing differently. These procedures are not detailed
here, but the interested reader will find various procedures
described in Coffman, 11; Cardall and Coffman, 9; Angoff, 2; Angoff
and Ford, 3; Echternacht, 26; Schueneman, 99, 97; Fishbein, 31;
Green, 38; Green and Draper, 39; Merz, 73; Veale and Forman, 114; and
Merz and Rudner, 74.
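
As one simple illustration of the chi square idea, the sketch below tests whether the proportion passing a single item differs between two groups, using the standard formula for a 2 x 2 table. It is not the specific procedure of any author cited above, and it does not control for ability. The sample sizes of 500 per group are hypothetical; only the approximate percents passing echo the basketball item.

```python
def chi_square_2x2(pass_a, n_a, pass_b, n_b):
    """Chi square (1 degree of freedom) for pass/fail counts in two groups."""
    fail_a = n_a - pass_a
    fail_b = n_b - pass_b
    total = n_a + n_b
    denominator = n_a * n_b * (pass_a + pass_b) * (fail_a + fail_b)
    if denominator == 0:
        raise ValueError("every margin of the table must be non-zero")
    return total * (pass_a * fail_b - fail_a * pass_b) ** 2 / denominator


# Critical value of chi square with 1 degree of freedom at the .05 level.
CRITICAL_VALUE_05 = 3.84

if __name__ == "__main__":
    # Hypothetical counts: 277 of 500 males and 185 of 500 females pass.
    statistic = chi_square_2x2(277, 500, 185, 500)
    print(round(statistic, 2), statistic > CRITICAL_VALUE_05)
```

An item flagged in this way is a candidate for review, not proof of bias; the content of the item still has to be examined.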

Another series of studies has used analyses that control for ability
level of groups. In general, the procedures cited above (with the
exception of Schueneman, 99) do not attempt to provide a control for
the initial ability of groups. Fishbein (32) also tried to take into
account the ability level of groups in looking at whether or not they
performed differently on an item in a particular test.

If groups of differing ability are used in studies, item difficulty
indices change. This problem has led a number of researchers to
suggest methods based on latent trait models, models in which the
measure of an individual's ability is assumed to be independent of
the distribution of abilities of examinees. Briefly, "A latent trait
model specifies a relationship between observable examinee test
performance and the unobservable traits or abilities assumed to
underlie performance on the test" (Hambleton and Cook, 41). Major
concepts, assumptions, limitations, and examples of applications of
latent trait models, including test bias, are given in a special
issue of the Journal of Educational Measurement (57):

Latent trait models assume that only a single ability (or latent
trait) is measured, and that the item responses of a given examinee
are statistically independent. The item characteristic curve (icc) is
a mathematical function that relates the probability of success on an
item to the ability measured by the item set. The number of
parameters required to describe the icc depends on the particular
latent trait model; the number of parameters is typically one, two,
or three (Hambleton and Cook, 41).
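
A common way to write an item characteristic curve is the logistic function sketched below. The logistic form and the parameter names (b for difficulty, a for discrimination, c for the lower asymptote or "guessing" level) are assumptions made for illustration; the passage quoted above does not commit to a particular function. Leaving a and c at their defaults gives a one-parameter curve; setting a gives two parameters; setting a and c gives three.

```python
import math


def icc(theta, b, a=1.0, c=0.0):
    """Probability of success on an item for an examinee of ability theta.

    b: item difficulty; a: discrimination; c: lower asymptote.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    for theta in (-2.0, 0.0, 2.0):
        one_param = icc(theta, b=0.0)
        three_param = icc(theta, b=0.0, a=1.5, c=0.2)
        print(theta, round(one_param, 3), round(three_param, 3))
```

In item bias work, a curve of this kind is estimated separately for females and males; an item is suspect when examinees of the same ability have different probabilities of success depending on sex.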

The one-parameter model (the Rasch model) has been proposed and used
to assess item bias by Durovic (24, 25), Wright, Mead, and Draba
(118), and Draba (23). The Rasch model assumes that all items in a
set have equal discriminating power, and thus the items vary only in
terms of difficulty. As Hambleton and Cook note (41, p. 83), this is
a very restrictive assumption and likely to be violated with most
tests. On the other hand, the three-parameter model requires large
numbers of examinees to estimate parameters and extensive computer
time (Lord, 65).

Ironson (52) has compared four of the procedures proposed: the
transformed item difficulty (Angoff), the item discrimination (point
biserial), the chi square (Schueneman), and the item characteristic
curve (three-parameter model) method. The methods identify different
items as biased, in her study, and show low to moderate correlations
between methods. The item discrimination method was least related to
the other three methods.

While there are no studies on which to base consensus or expert
agreement on the extent to which the use of one method or another is
preferable, it appears that the use of at least one of these
procedures, with a clear specification of the decision-making rules
for including items in the test, would enable the test user to make a
judgment as to whether or not the test is fair for a particular group
of girls or women. The absence of such a procedure appears to be more
important than the use of a particular procedure, until optimal
procedures are identified.

In terms of assembling the total test, for norm-referenced
achievement tests particularly, only Diamond (20) has suggested a
decision rule for selecting items. Diamond would reduce mean
differences between subgroups by examining subgroup differences when
selecting items on the basis of pretest data. Differences between the
percentage of upper and lower groups selecting the correct answer can
be calculated separately for each subgroup. The difference between
subgroups should not average 5 percent in either direction for any
given subtest, according to Diamond. Other recommendations for
dealing with this part of the test development process have not been
located. The decision rules used at this particular stage of test
construction are another procedure that test users should look for in
test manuals, and the use of these rules will help to define whether
a test is fair to females and to males.
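
One possible reading of Diamond's decision rule is sketched below. For each item, the upper-minus-lower difference in percent correct is computed separately for females and males; the female-minus-male gap in those differences is then averaged over the items of a subtest and compared with the 5 percent limit. The data, the function names, and the treatment of the limit as a strict upper bound on the absolute average are all assumptions; Diamond's own computation may differ.

```python
def mean_subgroup_difference(items):
    """Average female-minus-male gap in upper-lower percent differences."""
    gaps = []
    for item in items:
        female_diff = item["female_upper"] - item["female_lower"]
        male_diff = item["male_upper"] - item["male_lower"]
        gaps.append(female_diff - male_diff)
    return sum(gaps) / len(gaps)


def within_limit(items, limit=5.0):
    """True if the average gap stays under the limit in either direction."""
    return abs(mean_subgroup_difference(items)) < limit


if __name__ == "__main__":
    # Hypothetical pretest percents correct for a two-item subtest.
    subtest = [
        {"female_upper": 80, "female_lower": 40, "male_upper": 85, "male_lower": 35},
        {"female_upper": 70, "female_lower": 30, "male_upper": 72, "male_lower": 40},
    ]
    print(mean_subgroup_difference(subtest), within_limit(subtest))
```

A test manual that reports a rule of this kind, together with the limit used, gives the reviewer something concrete to check.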






Another type of item bias may be more generalized over a set of
items. In some instances it may be possible to identify a particular
item response format or set of directions that has a differential
effect for females and males. A study of one particular item
alternative, "I don't know" as a response, was carried out by Sherman
(100). She examined the use of the "I don't know" alternative in the
National Assessment of Educational Progress. In the beginning of each
NAEP administration, respondents were instructed how to answer the
exercises and were shown a sample multiple-choice exercise. A tape
recording was played during each administration. The following
instructions concerning the uncertainty or "I don't know" alternative
were read: "If you don't know the answer to an exercise, just fill in
the oval next to I don't know." (Sherman, 100, p. 2). After each
multiple-choice exercise was read to the respondents, the announcer
added, "If you do not know the answer, please mark the 'I don't know'
response." (p. 2)

Sherman's idea was that respondents choosing the "I don't know"
response could be responding on the basis of uncertainty but also on
the basis of personality variables, that some individuals might be
less self-confident, rather than less knowledgeable. She hypothesized
that this relationship might vary according to group membership.
Testing her hypothesis, Sherman found that an adjustment for the use
of the "I don't know" response by groups had a large impact on the
sex differences in science performance. Sex differences at the three
younger ages tested by the NAEP were reduced by a regression analysis
modification. The sex differences were virtually eliminated at the
adult level over the 66 exercises analyzed. As Sherman noted, "Sex
differences in correct response percentages for many of the exercises
at the adult level can be explained almost completely by differences
in usage of the "I don't know" alternative. Some exercises continued
to show a clear advantage for one sex or another after the data
modification but there are fewer showing the overwhelming male
advantage as depicted in NAEP data." (100, p. 14).

The particular form of sex bias detected by Sherman would probably
not be identified in the analysis of a set of items by the types of
procedures described above, where sex differences in performance for
individual items are examined. This is a particular response format
that would affect all the items in a test, providing a consistent
effect across all items. The item bias procedures developed so far
cannot identify such a difference. Teachers and administrators should
be alert to considering a variety of hypotheses for why one group may
score differently than another where there is no likely educational
reason for group differences to appear. The construct validity of the
achievement test is poor if response format or test directions affect
groups differently. Another situation, not necessarily applicable to
male or female performance, but perhaps to groups who more or less
frequently take tests, would be frequent changes in the manner in
which test items are presented. Students who are more used to taking
tests might tend to score higher on such a test than students who are
not used to taking tests and responding to a variety of different
directions. (See Schueneman, 98, for further examples of the effect
of item ordinal position and item format on subgroup differences in
performance.)

Policy

The research concerned with the issue of statistical evidence for
item bias suggests that teachers and school administrators need to
formulate policy in relation to the review and selection of
achievement tests for use in schools.

There are several types of data that reviewers of educational
achievement tests should find in test manuals. The most critical data
will be evidence that item analysis procedures have been carried out
to examine whether females and males performed differently on
individual items in the test. The second type of evidence is a
description of the decisions used to include or exclude items where
females and males perform differently in try-out or pilot testing of
items. There should also be a list of the items and the percents of
females and males passing each item for all items included in the
final version of the test. This listing of the percent passing the
items should also show the frequency distribution of item percent
passing separately for females and males.
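
The frequency distribution asked for here is easy to produce once the item percents are listed. The sketch below groups item percents passing into ten-point intervals for each sex; the percents, the interval width, and the function name are hypothetical choices made for illustration.

```python
def frequency_distribution(percents, width=10):
    """Count items in each interval of percent passing.

    Keys are the lower bounds of the intervals; 100 percent is placed
    in the top interval.
    """
    counts = {}
    for percent in percents:
        lower = min(int(percent // width) * width, 100 - width)
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical percents passing for the same three items.
    females = [68, 37, 52]
    males = [56, 55, 58]
    print("females:", frequency_distribution(females))
    print("males:  ", frequency_distribution(males))
```

Laid side by side, the two distributions show at a glance whether the items retained in the final form are of roughly comparable difficulty for females and males.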

Within content specification limitations, adjustments in the items
selected should reflect roughly comparable item difficulty or percent
passing for females and males to minimize differences between the two
groups in the final test distribution for comparable subgroups. The
procedures and the resulting data should be reported in a test manual
to provide the test user with the evidence to make the decision that
the test is a fair test for females and males. However, despite the
use of these procedures in item analysis and item selection, sex
differences in average performance may remain.




ISSUE: TEST BIAS AND CONSTRUCT VALIDITY 

Where sex differences in average performance on educational
achievement tests remain, the test publisher and the test user need
to seek evidence of construct validity. That is, does the test
measure the same construct for each group or is the test "biased" in
the sense of measuring different constructs for each group?

As mentioned earlier, construct validity is needed since test users
often make inferences beyond the actual content of the test items.
Cole (12) suggests that a test interpreter rarely, if ever, remembers
the particular types of items on a test when interpreting children's
scores. On the score report there will only be the label "math
concepts" or "science content" or "science process." She reports that
on the basis of content categorizations of items on the science
achievement test of the International Association for the Evaluation
of Educational Achievement (IEA), sex differences in science
achievement have been reported: girls perform more poorly than boys
on physics items and on "understanding" items. The construct
interpretation that users tend to make is that boys achieved higher
levels of understanding of science than girls, and sometimes the
inference extends to the additional judgment that girls do not have
the capability for high levels of science. Cole reports a study by
Carlson demonstrating that at least a few of the "understanding"
items involved sex-differentiated practical experience. Boys did
better on an "understanding" item involving how to put batteries in a
flashlight. Girls did better on an "understanding" item about how to
place a jar under hot water to get the lid off. Wherever sex
differences remain in performance on educational achievement tests,
publishers need to demonstrate, and test users in schools should
require, evidence of construct validity. One type of evidence of
construct validity is to experimentally manipulate item content.

A study by McCarthy (71) experimentally manipulated the context of
mathematics items and showed that girls performed better on items
that had a context of "typical" activities of women in our culture.
Her study suggests that it is possible to construct item pools and
develop a test on which we expect similar average performance of
females and males in mathematics or science achievement. Where
performance differences remain for groups of females and males, the
test publisher needs to examine other plausible rival hypotheses that
may account for differences if the test is claimed to be fair to both
sexes.

A second type of study is a demonstration within subgroups of females
and males that similar patterns of correlations exist for the
achievement test with a specified set of other measures or criteria
(such as aptitude measures). Ironson's study (52) suggests that this
approach of correlating achievement and aptitude measures to examine
the patterns of intercorrelations separately by sex may be feasible,
since she found fewer biased items in more abstract, less
achievement-oriented measures (e.g., picture-number and letter
groups). As noted above, the few studies of test construct validity
and sex bias have examined tests of mathematics and have varied the
familiarity of the context in which the mathematical process was
embedded. There is a particularly critical need to examine the
construct validity of tests at the high school level in the areas of
science and mathematics, areas that women are not traditionally
encouraged to view as "feminine occupations" or fields of study.

Another approach can be tried, similar to the System of Multicultural
Pluralistic Assessment (SOMPA) Mercer has developed for the WISC.
Publishers can develop measures descriptive of each individual's
relevant experience in mathematics, as well as attitudes toward
mathematics, as another approach to determining construct validity
for both females and males. Hypotheses can be stated for the
predicted relationships among correlations with achievement test
scores, past "experience" in mathematics, and attitudes toward
mathematics. Specification and provision of evidence on these
variables might give the user a "frame of reference" of other
variables to determine that a test is sex-fair to individuals even
though average differences in group performance may remain.

If the test user is given evidence to determine that sex differences
in test performance still remain after item bias studies, careful
item selection procedures, and studies of construct validity, what
policies can be recommended for test interpretation and counseling?

Two issues can be briefly noted here and placed in the context of
policy decisions. The first is the issue of whether or not there
should be separate sex norms, combined sex norms, or, perhaps,
experienced-reference norms if mean sex differences in performance
remain on educational achievement tests. The second issue is the
relationship of test performance and interpretation to course
selection.

Achievement tests do not now provide separate norms since there are
few areas of achievement showing sex differences. In science and
mathematics, key areas of concern for occupational desegregation,
there may be sex differences in performance at the high school or
college level. In these instances, the best policy may be to continue
with combined sex norms and to provide inservice training to teachers
and counselors, and additional interpretative materials to students
and parents, information that may help to place test scores in
context. We can reasonably assume that we do not know whether any sex
differences remaining are due to social or innate factors, and that
in any event for educational purposes it is more logical to assume
that they are the products of cultural experiences, including sex
role socialization practices.

Workshops for teachers and counselors can focus on test score
interpretation for individual students, pointing out the stereotypes
sometimes held for female and male performance. Awareness of the
importance of basic areas for later career choice, especially in
mathematics and science, can be stressed. School administrators can
examine sex differences in achievement test scores and course
enrollments within their own schools. At the high school level, are
there sex differences in math and science achievement for boys and
girls? If there are differences in performance, are there indicators
that attitudes of the staff differ for boys and girls? Are different
courses counseled for boys and girls? Are staff classroom behaviors
equally responsive to the performance of boys and girls, or with
girls and mathematics particularly, are there specific steps taken in
instructional design to ensure that girls succeed and develop a
positive attitude toward mathematics to the same extent the boys do?

Assertive, affirmative action is needed for many girls who will not
consider additional mathematics courses without suggestions and
support from teachers and counselors. The development of special
college courses to assist women to overcome fears of further
education in mathematics is ample evidence of the need to work more
affirmatively in the junior and senior high schools.

Teachers and counselors must show the same assertive action in
assisting girls to continue work in the sciences. Courses in physics
may be open to girls, but more is required: girls must be actively
recruited to pursue course work in the sciences and technical fields.
Teachers and guidance staff can relate achievement test areas to
occupational clusters and emphasize the basic importance of
mathematics and science in leaving one's "options" open. Stereotypes
and prejudices toward women in math, science, and the trade/technical
areas can be explicitly addressed. Confronting the folkways and
prejudices can assist boys and girls to examine their own attitudes
toward the achievement of each sex in these areas.

Teachers and counselors can also work with students and parents in
reviewing test manuals and achievement tests for evidence of sex
bias, to examine interpretive materials for overt sexism, and to
review their own interpretive actions with respect to individuals for
sexism. Parents can be assisted to examine their own values in
relationship to student achievement scores. Again, the key areas for
exploration here are mathematics, science, and nontraditional
occupations for both sexes.



POLICY SUMMARY: EDUCATIONAL ACHIEVEMENT 
TESTS 

Sex bias in educational achievement testing can be examined at
several levels. First, there is the review for overt sexism. Second,
item analysis data and decision rules for selecting items can be
examined in manuals provided by test publishers. Where mean sex
differences in performance remain in educational achievement tests,
review panels selecting tests, whether at local schools or for
statewide assessment, need to determine whether the publisher has
provided evidence on the construct validity of the achievement tests.
Materials that accompany the tests, test administration directions,
all manuals, forms, and other interpretive materials accompanying the
educational achievement test also need to be examined for overt sex
bias and for their assistance to students, parents, and teachers in
understanding the possible effects of sex role socialization on
student performance. Inservice courses for teachers, counselors, and
administrators will help to ensure sex-fairness in presentation and
interpretation of achievement test scores. All of these actions are
directed toward assisting girls to experience success and eliminate
sex stereotypes of achievement in any area, but particularly in
mathematics and science, since these are critical areas influencing
later career choices and options. Career interest measures are also
influences on occupational choices.







SEX BIAS AND CAREER-INTEREST MEASUREMENT 



OVERVIEW: OBJECTIVITY AND SUBJECTIVITY
IN MEASUREMENT

Tests that assess an individual's interest in different occupational
areas or basic fields of interest comprise a small percentage of
tests sold on an annual basis. However, interest assessments are
included in several major college admission programs and it is
estimated that 3.5 million interest tests are administered annually
(111), including those given as part of college admissions testing.

The interest measurement area provides a study in the nature of
scientific psychology and whether there is "objectivity" or
"subjectivity" in its empiricism. Two of the major interest
inventories, the Strong Vocational Interest Blanks for men and women
and the Kuder Occupational Interest Survey DD (KOIS), were
empirically developed. Both of these measures, until very recently,
reported a separate set of occupational scales to females and males.
The scales reported for women were based on a smaller number of
occupations and emphasized traditional women's occupations.

The inventories were developed on an empirical basis, with little in
the way of a theoretical formulation to guide them. Theoretical
formulation in this case refers to a broader conception of the domain
of occupations, career guidance, and the fundamental assumptions made
about the occupational world and its relationship to males and
females. If equal opportunity and rights are of basic concern, the
strictly empirical basis of developing vocational interest
inventories was not adequate to meet these concerns. There was not a
consideration of alternative ways to assess interests and to ensure
that women were encouraged to consider choosing occupations not
typically thought to be open to them. Both the Kuder and the Strong
made it clear to women that there were separate occupational scales
for men and women. There were 77 occupational scales for men and a
total of 57 for women (which included 20 on the men's scale). There
were 17 scales that overlapped and appeared in both the female and
male occupational norms. The KOIS also presented college major scales
for females and males. These scales were particularly discriminatory,
since college major scales for women did not include the following
areas: law, business management, marketing, finance, government,
pre-medical, dentistry, and pharmacy. These items are noted as
background, to give the status of interest inventories up to the
early 70s.

In the 1970s pressures from the professional guidance organizations
and the National Institute of Education project which formulated
Guidelines on Sex Bias and Sex Fairness in Career Interest
Measurement were instrumental in encouraging some change. The major
changes are noted below, but, as described in the policy sections,
the changes are incomplete thus far. And the changes in tests are
only part of the changes required in the counseling system within
which they are used.



Types of Interest Inventories



There are two types of scale construction
traditionally used for interest inventories.
Occupational scales are based on
the empirically determined relationship
between the interests expressed by the
taker of the interest inventory and those
of individuals already employed in occupations.
Two examples of this type of scale
are the occupational scales of the
Strong-Campbell Interest Inventory (SCII) and
those of the Kuder Occupational Interest
Survey DD (KOIS). Homogeneous scales,
scales based on internal criteria, are
developed through some form of clustering
items--similar types of activities for
a job or interest area, the theory of the
test constructor, perhaps sorting by
judges, or intercorrelation of items and
scores. The responses of the test taker
are reported for scales internal to the
instrument. Examples of instruments with
homogeneous scales are Holland's Self-Directed
Search (SDS), the American College
Testing Program Interest Inventory
(ACT), and the Ohio Vocational Interest
Survey (OVIS).

The two types of scale construction have
led to different problems in examining
sex bias. These problems are reflected
in the NIE Guidelines for Assessment of
Sex Bias and Sex-Fairness in Career
Interest Inventories (Diamond, 19).
These guidelines are summarized here and
the major issues are discussed in more
detail for particular guidelines.

The NIE Guidelines 

As Diamond (19) described the process,
the working definition of sex bias used
in the development of the NIE Guidelines
was:

Within the context of career
guidance, sex bias is defined
as any factor that might influence
a person to limit--or might cause
others to limit--his or her consideration
of a career solely on
the basis of gender.

The NIE Guidelines have three major areas:
the inventory itself, technical information,
and the interpretive information
that accompanies the inventory. The
first area, the inventory itself, is
concerned with the "face validity" of a
sex-fair instrument--overt sex bias and
sex-role stereotyping. The technical
information section includes issues
specific to the homogeneous scales and
the occupational scales. The interpretive
information section is concerned with
materials for students and counselors,
to reduce sex-role stereotyping of occupations,
and to encourage career exploration
activities. The NIE Guidelines are
briefly summarized as follows.



(1) The Inventory Itself.

a. The same form should be used
for women and men unless it is
empirically shown that separate
forms minimize bias.

b. Scores should be given on all
occupations and interest areas
for both women and men.

c. Item pools at the inventory
and scale levels should reflect
experiences and activities
equally familiar to each sex.

d. Occupational titles should be
presented in gender-neutral terms.

e. Use of the generic "he" should
be eliminated.

(2) Technical Information.

a. Technical manuals should describe
how the inventory meets
these guidelines.

b. The rationale for separate
scales should be given.

c. The same vocational areas
should be indicated for each
sex even if it is empirically
demonstrated that separate
inventory forms are more effective
in minimizing sex bias.

d. Sex composition of the criterion
and norm groups should be
described.

e. Criterion and norm data should
be updated every five years.

f. Information on the distribution
of career options suggested
for each sex should be provided.

g. The validity of interest inventories
for minority groups
should be investigated.

Stebbins, Ames, and Rhodes (102) provide extended comments on the rationale for each
guideline and give examples of their purposes. Diamond (19) is a useful reference for
the technical and research issues, as well as the research studies in Tittle and Zytowski
(111).
(3) Interpretive Information.

a. Interpretive materials should
point out that vocational interests
and choices of women
and men are influenced by many
environmental and cultural
factors, including early socialization,
sex-role expectations,
and home-versus-career
conflict.

b. Orientation to the inventory
should encourage respondents
to examine stereotypic sets
toward activities and occupations.

c. The users' manual should state
that all jobs are appropriate
for qualified persons of either
sex and should attempt to dispel
myths about women and men
based on sex-role stereotypes.

d. Interpretive materials should
encourage exploratory experiences
in areas where interests
have not had a chance to develop.

e. Case studies and examples
should represent men and women
equally and include examples
of each in nonstereotypic
roles.

A majority of guidelines apply to interest
inventories and accompanying manuals regardless
of whether the inventories have
occupational scales or homogeneous scales.
Individual issues and policy recommendations
follow.



ISSUE: FACE VALIDITY

"Face" validity is concerned with overt
bias in occupational titles and sex-role
stereotyping in interest inventories.
Guidelines 1a, d, and e and guidelines
3a, b, c, and e are concerned with this
issue. With most interest inventories,
it is now possible to use the same form
of the interest inventory for women and
men. One exception is the Minnesota Vocational
Interest Inventory (MVII). The
MVII has been developed for occupations
in the non-professional fields and is
based entirely on the interests
of men in these occupations. Under Title
IX, this inventory could not be used unless
a similar inventory were available
for women. Its use in assisting only male
students for vocational education program
placements would be discriminatory against
women. The publisher to date has not
provided a comparable form for women, nor
has there been an attempt to revise or
develop scales for the MVII. The administration
of the MVII would also be prohibited
under the definition of sex discrimination
used in the Vocational Education Act
of 1976.

Policy 

The most important policy for school
administrators and counselors is the
decision to develop a review procedure
and a checklist form for career-interest
measurement and accompanying guidance
materials, derived from the guidelines
concerned with the "face validity" of
instruments and materials. The checklists
should include the following items: Is
the same test form (set of items) used for
both girls and boys? Earlier forms of
tests that had separate forms for men and
women should be discarded. Are all occupational
titles in gender-neutral terms?
Terms that restricted occupations to males
or females should be restated in such form
as, for example, firefighter, letter carrier,
and flight attendant. Is the generic
"he" eliminated from all test and interpretive
materials?

For interpretive information, the guidelines
in 3a, b, c, and e should be in the
checklist for review: Are interpretive
materials provided that describe the possible
effects of early socialization, sex-role
expectations, and home-versus-career
conflict? How are the issues of cultural
factors that influence the vocational
interests and choices of women and men
included? Are students and counselors
encouraged to examine sex-role stereotyped
ideas about the appropriate activities
and occupations for females and males?
Do the manual and the student interpretive
material state that all jobs are
appropriate for qualified persons of either
sex? Are there facts presented about
women in the world of work that dispel
myths about women? In sum, the instrument
and its interpretive materials must take
a positive, strong affirmative action
position.



ISSUE: SAME SCALES FOR FEMALES AND MALES

Guideline 1b, scores should be given on
all occupations for both women and men,
applies particularly to the interest inventories
with occupational scales. Both the
SCII and the KOIS, for example, use the
same set of items for the basic inventory,
yet the criterion or occupational groups
are constructed separately by sex. The
issue is that women still know that the
occupational world is differentiated by
sex, since scales are normed or developed
separately for females and males. On the
SCII there is an occupational scale for
physician (m) and physician (f) on the profile
form for test results. These scales
are constructed on the basis of responses
of a group of male and female physicians,
respectively. Response differences between
the occupational group and the men- or
women-in-general groups are used to derive
the scale scores. The profile reports
both scores, asterisking the same-sex
score (i.e., an asterisk appears by the
physician (f) for a girl's profile). This
practice reinforces the stereotype of sex
differences in occupations and is presently
justified only by item data that show sex
differences in response rates. For example,
items such as the following show differences
(Stebbins, Ames, and Rhodes, 102):
"Would you like to race automobiles?"



TABLE 9

Response Rates for an SCII Item

[Table values are not recoverable from the source. Rows: Like, Indifferent, Dislike.
Columns: Female race drivers, Women-in-general, Male race drivers, Men-in-general.]



Occupational scales on the SCII are constructed
by examining the differences in
responses for members of an occupational
criterion group from a sample of men-in-general
or women-in-general. In the
example in Table 9 the responses of male
race drivers are not very different from
those of a representative sample of men
(men-in-general). When a male responds
"like" to the item, there is little here
to distinguish him from male "race
drivers." However, when a woman responds
positively, she has given an unusual
response, one that differentiates her
from women-in-general and that is similar
to female "race drivers." This type of
item response means that the item would
appear on the female-normed scale for
race driver but the item would not be
used for the male-normed scale for race
driver.

The KOIS does not use men-in-general and
women-in-general groups in constructing
occupational scales. Differences in
occupational scales for men and women
are based on the response differences
between men and women in the same occupation.
Scores on cross-sex scales for the
KOIS (same-named occupational scales for
females and males) therefore reflect
directly any sex differences in response.
On the SCII, however, with the use of
the in-general groups for scale construction,
the more "traditional" a woman's
interests are, the less likely she is to
respond the way the men-in-general group
does, and therefore the higher her score
on male-normed scales is likely to be
(for traditionally female occupations).
Conversely, on the KOIS, women will often
get lower (Lambda) scores on cross-sex
norms than on their own (same-sex) norms.

Johansson and Harmon (55), Campbell (8), Hansen (42), Webber and Harmon (116) provide
further examples of analyzing the SCII for sex differences and the implications for
occupational scale construction.
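
The item-selection logic described above for in-general comparison groups can be sketched in a few lines of Python. This is a minimal illustration only, not the SCII's actual scoring procedure: the response rates, the 20-point cutoff, and the function names are assumptions chosen to mirror the race-driver example, in which an item separates women in the occupation from women-in-general but does not separate men in the occupation from men-in-general.

```python
# Hypothetical "like" response rates (proportions) for a single item.
# These values are illustrative only; they are not the Table 9 figures.
LIKE_RATES = {
    "female": {"occupational_group": 0.80, "in_general": 0.10},
    "male": {"occupational_group": 0.95, "in_general": 0.90},
}

MIN_DIFFERENCE = 0.20  # assumed cutoff, not taken from any test manual


def item_differentiates(occupational_rate, in_general_rate, cutoff=MIN_DIFFERENCE):
    """True when the occupational group's response rate differs from the
    same-sex in-general group by at least the cutoff."""
    return abs(occupational_rate - in_general_rate) >= cutoff


def scales_using_item(like_rates, cutoff=MIN_DIFFERENCE):
    """List the sex-normed scales on which the item would be retained."""
    return [
        sex
        for sex, rates in like_rates.items()
        if item_differentiates(rates["occupational_group"], rates["in_general"], cutoff)
    ]


print(scales_using_item(LIKE_RATES))
```

With these assumed rates the item is retained only for the female-normed scale, which is the pattern the text describes: the same item pool yields different scale content for each sex.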

Items currently used in instruments with
occupational scales were not examined for
bias when originally constructed; that is,
the items were not selected to be
"equally desirable" by a general sample of
females and males. In the absence of this
criterion for item selection, the
development of new occupational scales is
limited to items that are already in the
item pool and that vary on this characteristic.
As one example, Johansson and
Harmon (55) examined items common to the
earlier Strong Vocational Interest Blank
(SVIB) for females and males--two separate
forms--for 14 occupational criterion
groups. They found that between 15 percent
and 21 percent of these items represented
sex-stereotypic responses only and hence
were not valid in differentiating occupationally
related differences.

A study by Harmon and Conroe (46) examined
the sex stereotyping of occupational titles
versus occupational activities. Interest
inventories differ in their use of the
titles or activities of occupations in
items. The Harmon and Conroe study provides
some evidence that occupational
titles are perceived more stereotypically
than activities. That is, "doing research
work" may not be perceived as stereotypically
as "scientific research worker."

As these studies suggest, the ultimate
goal for the interest inventories is a
set of scales that do not incorporate
sex differences. This goal is stated
in guideline 1c, item pools should reflect
experiences and activities equally familiar
to each sex, and is not presently met
for the SCII and other measures using
occupational scales. (The guideline has
been met with some homogeneous scales
described below.) As a result, the test
user must carefully examine the interpretive
materials and the guidance setting
for steps taken to counter the sex stereotyping
of occupations that remains in the
instruments. The different norms (male
and female) can be used positively to
initiate discussion on sex-stereotyping
of occupations and activities if counselors
so desire.

Guideline 1c, however, has been largely met
with the item pool for one of the interest
inventories with homogeneous scales--an
experimental form of the ACT, the Unisex
Interest Inventory (Uni II). Rayman (90)
examined the ACT IV items for sex differences,
had sex-balanced items written
(i.e., items likely to exhibit 10 percent
or less difference in "like" responses
between the sexes), and constructed the
Uni II. This instrument was administered
along with the ACT IV to 3,000 college-bound
students. The results showed the
average differences between females and
males (percentage of "like" responses)
were much smaller for the Uni II than for
the ACT IV. There were no significant
differences between raw score means for
the sexes in the Realistic, Artistic, and
Conventional Scales; statistical (but
not practical) differences were found
for the Investigative and Enterprising
Scales. The sex balance was least well
achieved for the Social Scale. Hanson
and Rayman (44) pursued this development
and examined validity-related analyses.
They concluded that the sex-balanced
scales were generally valid, as measured
against the ACT IV.
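
The screening rule for sex-balanced items can be illustrated with a short Python sketch. It assumes that "10 percent or less difference" means a gap of at most 10 percentage points in "like" responses; the item wordings and response percentages below are hypothetical and are not drawn from the ACT IV or the Uni II.

```python
MAX_GAP = 10  # assumed reading: at most 10 percentage points between the sexes

# Hypothetical items: (percent of females answering "like", percent of males).
CANDIDATE_ITEMS = {
    "repair an engine": (15, 55),
    "plan a budget": (40, 45),
    "teach a class": (60, 42),
    "write a report": (35, 30),
}


def like_gap(female_like, male_like):
    """Absolute female-male difference in percent "like" responses."""
    return abs(female_like - male_like)


def sex_balanced_items(items, max_gap=MAX_GAP):
    """Return the items whose "like" gap does not exceed max_gap."""
    return [
        name
        for name, (female_like, male_like) in items.items()
        if like_gap(female_like, male_like) <= max_gap
    ]


print(sex_balanced_items(CANDIDATE_ITEMS))
```

Only the items with small gaps survive the screen, so a scale built from them should show smaller raw-score differences between the sexes than a scale built from the full pool.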

Policy 

Since scores on all occupational scales are
currently reported to all users, the policy
question focuses on interpretation. How
should counselors and students work with
all the scores presented to students?
Similarly for homogeneous scales, how
should counselors and students view the
same-sex and combined-sex (or opposite-sex)
norms reported? Special interpretive
material is required, and in-service
workshops are needed for teachers and
counselors to explore the differences
between the female and male scales and
norms reported to individuals.

Administrators and counselors should carefully
review the reports received by
students for the inclusion of discussion
of the two sets of norms, of the likely
influence of sex-role socialization on
interests, and their relationship for
the particular boy or girl receiving the
results. Both written materials and
interactions with counselors are the best
policy for schools.

Johnson (56) and Lunneborg (66) provide further examples and discussion of the issue of
cross-sex norm interpretation. Johnson has recommended the elimination of the men-in-general
and women-in-general groups in the construction of SCII scales. Tittle and Denker
(110) show this effect on the KOIS.



ISSUE: CAREER OPTIONS SUGGESTED TO 
FEMALES AND MALES

Guideline 2f is concerned with another
issue of the homogeneous scales--the
distribution of career options suggested
for each sex. For example, are careers in
the interest area of science suggested in
the same proportion (overall frequency)
to both boys and girls? Ideally, according
to one definition of sex-fair testing,
the proportion should be the same. Hanson
and Rayman have demonstrated that the same
raw score means can be achieved using sex-balanced
items. However, with the usual
unbalanced item pools, this equality does
not occur and therefore raw scores are
"normed" (adjusted to within-sex means
and standard deviations, and placed on
a scale with the same standard deviation
for each interest area). This procedure
highlights another controversy since
unequal distributions of suggested options
result for the sexes unless same-sex norms
are used.
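
The effect of the choice of norm group can be seen in a small worked example. The sketch below is illustrative only: the raw scores are hypothetical, the reporting scale (mean 50, standard deviation 10) is an assumption rather than the convention of any particular inventory, and the population standard deviation is used for simplicity.

```python
from statistics import mean, pstdev


def normed_score(raw, norm_group, scale_mean=50.0, scale_sd=10.0):
    """Place a raw score on a common scale relative to a chosen norm group."""
    group_mean = mean(norm_group)
    group_sd = pstdev(norm_group)  # population SD; an assumption of this sketch
    return scale_mean + scale_sd * (raw - group_mean) / group_sd


# Hypothetical raw scores on one interest scale built from unbalanced items.
FEMALE_RAW = [4, 6, 8, 10, 12]
MALE_RAW = [10, 12, 14, 16, 18]

raw = 12  # one female student's raw score
same_sex = normed_score(raw, FEMALE_RAW)
combined_sex = normed_score(raw, FEMALE_RAW + MALE_RAW)
opposite_sex = normed_score(raw, MALE_RAW)

print(round(same_sex, 1), round(combined_sex, 1), round(opposite_sex, 1))
```

The same raw score looks high against same-sex norms, near average against combined-sex norms, and below average against opposite-sex norms, which is why the choice of norm group changes the options suggested to each sex.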

Hanson, Noeth, and Prediger (45) examined
four ways of reporting scores: interest
profiles based on (a) raw scores, (b)
combined-sex norms, (c) same-sex norms,
or (d) opposite-sex norms. Their samples
were tested with either the VIP or ACT
(1970 or 1972) and followed up in college
in 1975. They concluded that same-sex
norms provided results showing criterion-related
validity as high as or higher than
the other procedures, and same-sex norms
offered the additional advantage of suggesting
similar vocational options to
females and males (guideline 2f). (See
also Prediger and Hanson, 86, for similar
conclusions.) Where raw score distributions
for interest areas are not the same
for females and males, same-sex norms are
required to achieve this goal.

Gottfredson and Holland (36) presented a
study in which the SDS raw
scores appeared to be more efficient
predictors of self-reported occupational
choices (t- to B-^year follow-up) than 
sex-specific norms. Holland (49) has
pointed out that interest inventories have
two purposes--exploration and prediction--
and that some may function better than
others for each purpose, as part of the
justification for using raw scores with
the SDS. Prediger and Cole (85) argued
strongly for a different criterion and
examined the relationship between sex-role
socialization, employment data, and
interest inventories.

Policy 

Although not specifically taken up as an
issue in the NIE Guidelines (except in
guideline 2f), this issue may be resolved
in policy and practice by requiring that
both types of information be reported
to the student in vocational guidance.
That is, if both the same-sex and opposite-sex
norms (or both raw scores and normed
scores) on any homogeneous scales are
reported, students can question and explore
whether response differences are due to
sex-role socialization and limited experience
or to genuine differences in their
interests as individuals. Recent recommendations
of the Office of Civil Rights
(OCR) and the Association for Measurement
and Evaluation in Guidance (AMEG) Sex
Bias Commission include providing scores
on both sets of norms for everyone. This
permits individuals to determine how they
rank with others of their own sex exposed
to similar socialization experiences and
how they rank with individuals of the
opposite sex.



Prediger and Hanson (in Diamond, 19) defined sex restrictiveness as evidence of sex bias:
An interest inventory is sex-restrictive to the degree that the distribution of career
options suggested to males and females as a result of the application of scoring or interpretation
used or advocated by the publisher is not equivalent for the two sexes. Conversely,
an interest inventory is not sex-restrictive if each career option covered by the
inventory is suggested to similar proportions of males and females. A sex-restrictive inventory
can be considered to be sex-biased unless the publisher demonstrates that sex-restrictiveness
is a necessary concomitant of validity.
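
One way a school might check an inventory against this definition is to compare the proportion of females and of males to whom each career option is suggested. The following Python sketch is an assumption-laden illustration, not a procedure from the NIE Guidelines: the data are hypothetical, each student is assumed to receive exactly one suggested interest area, and no threshold for "similar proportions" is implied.

```python
from collections import Counter


def option_proportions(suggestions):
    """Proportion of respondents to whom each career option was suggested."""
    counts = Counter(suggestions)
    total = len(suggestions)
    return {option: count / total for option, count in counts.items()}


def restrictiveness_gaps(female_suggestions, male_suggestions):
    """Absolute female-male difference in proportion, per career option."""
    female = option_proportions(female_suggestions)
    male = option_proportions(male_suggestions)
    options = sorted(set(female) | set(male))
    return {o: abs(female.get(o, 0.0) - male.get(o, 0.0)) for o in options}


# Hypothetical data: one suggested interest area per student.
FEMALE = ["social"] * 6 + ["science"] * 2 + ["technical"] * 2
MALE = ["social"] * 2 + ["science"] * 4 + ["technical"] * 4

gaps = restrictiveness_gaps(FEMALE, MALE)
for option, gap in gaps.items():
    print(f"{option}: {gap:.2f}")
```

Gaps near zero for every option would be consistent with an inventory that is not sex-restrictive in the sense defined above; large gaps identify the options whose distribution differs by sex.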

E. E. Diamond, personal communication, February 1978.
For the SDS, which reports only raw scores
for each scale, there is no easy path to
a judgment of equal access to occupational
programs. The distributions of suggested
areas for career exploration do differ by
sex (reflecting the current occupational
segregation patterns). The Holland scales
can be interpreted as reflecting socialization
practices, and women tend to score
highest on the Social and Artistic scales
and very low on the Realistic scale. The
pattern of sex differences does not meet
definitions of sex-fairness in counseling.
A sex-fair policy would be to provide
extensive supplementary interpretive materials
for the student and career guidance
staff; the SDS should not be used by
itself in sex-fair career guidance and
counseling.



ISSUE AND POLICY: MINORITY AND MATURE
WOMEN AND INTEREST INVENTORIES

The last technical guideline is one for
which limited information exists; guideline
2g recommends the investigation of
the validity of interest inventories for
minority groups. Gump and Rivers (40)
reviewed the literature for validity
studies for black women and found little
direct evidence of validity. They concluded
that the validity of current inventories,
with occupational scales constructed
on predominately white samples, was
questionable. Lamb (62), however, studied
the validity of the ACT Interest Inventory
for classifying students into educational
major groups, for both females and males
of five ethnic groups. Lamb found the
structure of interests was comparable
across the white and minority samples,
with the exception of Native American
(Indian) males. The accuracy of classification
was comparable for most minority
groups. Lamb's sample consisted of
college seniors, however, and it is not
clear to what extent these findings would
generalize to the secondary level. Administrators
and counselors need to look
to the interpretive materials and career
guidance activities to make a determination
of sex-fairness for minority women,
since technical data to support sex-fairness
for minority group women are not
available. As appropriate, counselors
need to consider whether culture-specific
experiences will limit women's opportunities.

Similar cautions apply to the use of interest
inventories with mature women.
Although there is some evidence (Denker
and Tittle, 17) that older women view
interest inventory results as appropriate,
there has not been systematic work looking
at item responses for this group of women.
Again, counselors will need to use interest
inventories to explore options, not close
them.

Picou and Campbell contains a series of papers on minority groups but again lacks data
on interest inventories.



ISSUE: THE PURPOSE OF INTEREST INVENTORIES
AND VALIDITY

An issue of concern is the type of validity
emphasized for interest inventories.
Holland (49), as mentioned earlier, pointed
out that interest inventories may have
two purposes--exploration and prediction.
Prediction is based on a model of human
behavior that says from one point of time
we can predict a future event, such as
occupational entry and satisfaction. Implicit
in this model for prediction is the
assumption that there is stability in
behavior. For women, this is the wrong
assumption at present. The current emphasis
in intervention programs and counseling
is to assist women to explore other than
traditional views of occupations. This
emphasis is likely to ensure a more "unstable"
or less predictable set of patterns
of career choices for a period of time.
Taking this point of view, it is more
appropriate to emphasize what may be called
the exploration validity of interest inventories,
rather than predictive validity.
The effect of interest inventories should
be to provide satisfactory exploration
validity. This term is not well defined
yet (Super and Hall, 105) but can be
tentatively defined as the extent to which
the career interest inventory is useful in
stimulating the student to explore--to
seek information about occupations new
to the student and to try new activities
that may be related to career choice. The
effects of career-interest inventories
focused upon here, along with any interpretive
materials or guidance activities
provided with the inventory results, are
stimuli to exploration. (Guideline
3d suggests that interpretive material
should encourage exploratory experiences
in areas where interests have not had a
chance to develop.)

Recent research results suggest that counselors
should focus on these effects, and
that test publishers and career guidance
staff should be designing interpretive
material for interest inventory results
that encourages more career exploration.
Holland (48, 49) and Holland, Takai, Gottfredson,
and Hanau (50) have summarized
exploration-related validity studies (using
experimental designs and the SDS) for the
number and types of occupations considered
by high school level clients that support
the use of the SDS for exploration. The
Holland et al. (50) study used high school
girls only (N=252).

Cooper (14) compared other materials, using
a control group design. She examined the
effects of the SCII, a non-sexist Vocational
Card Sort (VCS) (Dewey, 18) and auxiliary
materials designed to make women respondents
aware of myths and realities of women
in the world of work. For her sample of
college women a few differences were found.
The VCS was more effective in broadening
career options and increasing the frequency
with which women students read occupational
information.

Zytowski (119) assessed the effects of
giving KOIS results to high school juniors
and seniors (N=100). Both boys and girls
were in the sample. There was an increase
in accuracy of self-ranks on expected order
of scoring on a set of selected occupations.
There was no increase in self-reports of
information-seeking behavior as a result of
receiving KOIS profiles and interpretation,
in contrast to studies of the SDS.

A study by Prediger and McClure (87) used
an intervention with ninth and twelfth
grade high school girls to try to encourage
exploration of careers in science and
technology. The interventions included,
for ninth graders, a career interest inventory
(ACT-VIP) and group discussions of
career planning. The experimental group
reported more career exploration activity
(but not specific to science and technology).
The twelfth grade college-bound
girls were mailed a booklet on careers for
women in science and technology designed
to encourage women through role models
and discussion of myths on women's careers.
The intervention was not successful in
terms of the outcome measures used.

Takai and Holland (106) have also examined
the effects of the VCS, the SDS, and the
Vocational Exploration and Insight Kit
(VEIK), a combination of the VCS, the SDS,
and a plan for exploration activities. In
a sample of 241 high school girls there
were no significant differences between
the effects of the VCS and the SDS on the
variety-of-occupations criterion, and the
SDS was more effective than the combination
treatment of VEIK. No significant differences
were obtained on a satisfaction scale
criterion. Other criteria examined but not
used in the analysis were pre-post measures
on the number of occupations considered,
satisfaction with choice, and the variety
of information-seeking activities (self-reported).
A no-treatment group was not
included in the design. Oliver (81) summarized
research on modes of test interpretation
and listed additional criteria
such as accuracy of self knowledge, certainty
of choice, and realism of choice.
These criteria may be useful in addition
to the exploration-related criteria (also
reported in Oliver's study).

In this series of studies different
criteria for exploration validity have 
been used. Counselors and administrators 
may find it helpful to determine which 
ones are most important as outcomes for 
students in their schools. 

Policy 

There are several policy implications for
career guidance based on the studies of
the effects of interest inventories in
increasing career exploration. The first
is that this changing emphasis in career
guidance should be considered in order
to determine its relevance in the local
school. If career exploration is seen as
the primary goal in using interest inventories
and other career materials, administrators
and counselors need to determine
their own criteria for this goal.
Counselors may want to establish procedures--
checklists of the activities, in
addition to interest inventories, that are
available to students to promote career
exploration. These checklists would assist
students and counselors to know the resources
available locally to find information
on the nature of specific occupations. These
checklists can be used to provide
criteria for determining the effectiveness
of local efforts to increase career exploration
for both girls and boys. The fre-
The studies thus far on the effects of
interest inventories suggest that on their
own, without supplementary material, they
are not as effective in stimulating exploration
as they might be. The research
studies are also limited, with the exception
of Zytowski's (119) study, in focusing
more recently on samples of girls only,
so that the comparative effectiveness of
exploration-designed activities for females
and males cannot be examined. And the
effectiveness of interest inventories and
activities in encouraging exploration of
non-traditional occupations for both girls
and boys has not been determined.



ISSUE: COUNSELOR USE OF INTEREST
INVENTORIES

One of the issues in sex bias and interest
inventories has been a concern with the
counselor's use of interest inventories
and possible counselor bias in counseling
females and males. Most of the research
on sex bias and interest inventories
has examined the inventories themselves,
their technical development, and interpretive
materials. To date, there have
not been studies that examined whether
counselors interpreted interest inventory
results differently according to a client's
sex. Oliver (81) evaluated career counseling
outcomes for three types of test
interpretation. However, there was no
analysis of possible differential treatment
effects by sex of the client.



There are related studies of counselor
attitudes, however. These studies have
examined the question of whether knowledge
of a person's sex will affect the educational
and occupational [illegible],
evaluation, or treatments provided by a
counselor. Harway and [illegible] (47) have
reviewed the evidence on counselor attitudes
and possible sex bias. Research
findings are contradictory, but included
in the limited research findings available
are several studies that provide a cause
for concern. The studies suggest the
need for a policy related to counseling
and interest inventories. In one study
of vocational counseling, Thomas and
Stewart (107) examined counselor attitudes
toward career goals of women with what
they classified deviate (traditionally
masculine) career goals versus conforming
goals (education as a career, for example).
The results of their study were
that women who were perceived to have
non-traditional or inappropriate career
goals were judged in need of further
counseling.

Another study by Schlossberg and Pietro-
fesa (96) examined the attitudes of
counselors in training. They arranged
interviews between counselor trainees and 
a coached female counselee for a counsel- 
ing practlcum. During the counseling 
session, the counselee informed the coun- 
selor she was a transfer student to the 
university, she was entering her junior 
year in college, and could not decide 
whether to enter the field of engineering, 
a "masculine" occupation, or enter the 
field of education, a "feminine" occupa- 
tion. The interviews were tape recorded 
and scored for statements indicating sex
bias. 

Statements by the counselor were consider- 
ed biased against the woman when she ex- 
pressed interest in the masculine field 
and the counselor rejected her interest in 
favor of the feminine vocation. The coun- 
selor's statement was considered biased 
for the female counselee when she ex- 
pressed interest in the masculine occupa- 
tion and the counselor supported or rein- 
forced the positive interest. Examples of 
the statements classified as negative bias
(NB) and positive bias (PB) were: 

Marriage and Family - Family attachment 

(NB) Would your husband resent your
being an engineer?

(PB) Being an engineer would not inter-
fere with your becoming married.

(NB) The course work in engineering is
very difficult.

(PB) Your classwork up to now shows
that you would do well as an
engineer.

Working Conditions - Where, with whom,
what kinds of work, and/or under what
conditions work is done

(NB) Engineering . . . it is very, you
know, technical, and very, I could
use the term "unpeopled."

(PB) You could work at a relaxed pace
as an engineer.

Masculine Occupation - Identification
of occupation as masculine

(NB) You normally think of this as a
man's field.

(PB) There is no such thing as a man's
world anymore.

Counselors displayed more bias (used more
negative bias statements) against females
entering a so-called "masculine" occupa-
tion than against females entering a
so-called "feminine" occupation. Also,
female counselors displayed as much bias
against females as their male counterparts.
These studies (although contradictory
studies exist in the research literature)
are well supported by informal statements
from counselors, parents, and students.
While sex bias in counselor attitudes may
not be extensively documented in the
research literature, there is sufficient
concern that counselors, teachers, and
administrators should be aware of its
possibility and local policy should ensure
that counseling is sex-fair.

Policy 

The policies suggested here are that local
schools provide a career guidance staff
checklist to insure regularized procedures
for counseling students and countering sex
bias in existing tests, interpretative
materials, and counselor behavior. Stu-
dents should be given a copy of the check-
list also. Categories for such a check-
list have been suggested by Schiffer (95).
Schiffer has grouped the major categories
for such a checklist as follows: limits
of the test, full range of career choices,
attention to test results, and counseling
back-up for students seeking to try
non-traditional jobs. These categories
are briefly described below.

• Limits of the test

Schiffer suggests developing a statement
on sex bias in the career interest instru-
ment. Problems of sex bias that are
present in the existing inventories
should be explained to the student. An
explanation will help to assure bias-free
counseling by encouraging students to ask
questions about alternatives. Secondly,
there should be a statement on sex stereo-
typing of occupations in our culture.
The vocational interests and choices of
men and women are influenced by many
environmental and cultural factors, includ-
ing early socialization, traditional sex-
role expectations of society, conflicts of
home and family versus career, and the
experiences of typical women and men as
members of various ethnic and social class
groups. Discussion of how these factors
may influence career choice and interests
should assist students to understand
limits in the test scores and to more
readily evaluate a large range of occupa-
tional choices. Counselors should also
make sure that students understand the
distinction between measures of interests
and ability, and the difference between
interest and current knowledge about a
profession. Such counseling will help
to assure that students who have not ob-
tained specific knowledge or who have had
limited personal experience because of
their sex will be able to pursue or con-
sider more non-traditional interests.

• Providing a full range of career choices

Particularly for occupational scales, or
other scales developed with separate sex
norms, students should be given their
results on both female and male norms or
scales. This policy is in agreement with
the NIE Guidelines, with Title IX inter-
pretations, and is followed by most test
scoring services. In presenting norms
or scales for both sexes, counselors have
the opportunity again to discuss the in-
fluences of past experience and sex-role
socialization as they may affect student
results on an interest inventory. Coun-
selors should examine clusters of career
choices, since the full range of occupa-
tions for men and women are not available.
several interest inventories, and the limited
or different occupations scaled for men
and women, Cole recommends that interest
inventory scales be used only to locate
a woman's interests on the circular
structure of a domain of occupations.
Lists of both "men's" and "women's"
occupations that relate to that location
should then be used. The implications
are that interpretative materials rela-
ting interest inventories to a broad
domain of occupations are an addition to
career inventories that may lessen the
effect of "biased" item pools and separate
criterion groups for occupations.

Another technique to be used is to discuss
a student's score in terms of a ranking of
interests rather than on the basis of the
absolute value or magnitude of the score.

• Attention to test results

Counselors should stress student evalua-
tion of career interest inventories in
light of the student's own sense about
career and activity interests. If the
test does not confirm a student's interest
or seems more narrow than those interests,
further exploration of careers or job pos-
sibilities is necessary. Counselors
should also encourage students to seek
other experiences. For students who score
high on sex-traditional occupations or a
narrow range of occupations, counselors
may need to examine whether students have
had any unusual experiences or should
seek other experiences to suggest alter-
native careers that the student may wish
to explore.

• Counseling back-up for students seeking
to try non-traditional jobs

This is a critical category, since students
who express interest in non-traditional
jobs, whether they are female or male, may
experience pressure to conform to more
traditional jobs (for their sex) from
peers, parents, teachers, or others with
whom they may discuss their career plans.
Counselors can assist and support students
with non-traditional interests by stress-
ing the current emphasis on affirmative
action obligations for hiring and promo-
tion of women and men in non-traditional
jobs. Non-discrimination also applies
in selection at institutions of higher
education and financial support, areas in



tlon, 

If a school develops a checklist for trans-
mitting the information described above to
students, and counselors expand upon the
checklist in talking to individual students,
the school will have actively met Title IX
requirements and will have taken a major
step toward sex-fair counseling.

Another set of policy recommendations,
which elaborates the suggestions above,
has been suggested by Cook (cited in
Schiffer, 95). These policy statements
are suggested for inclusion in a checklist
for counselors to monitor their own
behavior. The checklists may contain
policies such as the following:

(1) Counselors are equally available to 
male and female students on request. 

(2) Male and female students or potential
students are referred to counselors in
approximately equal numbers.

(3) Counselors recommend programs and
courses without regard to the sex
of the inquiring students.

(4) Career information materials have
been excluded from counseling and
guidance programs when they contain
sex bias and sex-role stereotyping.

(5) Career counseling programs provide
role models of men and women in a
variety of jobs and occupations
(including those non-traditional to
their sex).

(6) Men and women are equally represented
on the counseling staff.

(7) Interest inventories and other
appraisal instruments that contain
sex bias have been eliminated from
use or steps have been taken to re-
duce the ill effects of their bias
on occupational aspiration and occupa-
tional choice.

(8) As great an emphasis is placed on the 
career choices and career decisions 
of women as on the same decisions of 
men. 

(9) Women and men students are provided
information about their rights to
opportunities under the law and are
provided simulated activities for
dealing with sexism and discrimina-
tion.

(10) Programs are planned and conducted for
parents that assist them in working
with their sons and daughters on
career decisions, especially with
respect to sex bias and sex-role
stereotyping they may encounter.

(11) Guidance and counseling and placement
and follow-up records are maintained
and are reviewed periodically for the
differential impacts of the instruc-
tional, counseling and guidance, and
placement programs on females and males
who leave or complete school.

Another major policy to be considered is
whether guidance counseling staff are pro-
vided with the in-service training neces-
sary for sex-fair counseling and guidance
for students. The Sex Equality in Guidance
Opportunities (SEGO) materials are available
for in-service training (APGA, 4). Thus,
there are a series of policies that can
result in sex-fair counseling in guidance
for students at all levels of the educa-
tional system. These include policies on
the development and use of lists about the
sex-fairness of interest inventories that
are available for both counselors and
students, developing checklists for coun-
seling services more broadly viewed, and
developing or insuring that staff partici-
pate in in-service workshops designed to
assist them to provide sex-fair counseling
and use of interest inventories. A related
type of educational test, the aptitude
test, has many of the same issues and
policies as described here for the interest
measures.
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APTITUDE MEASURES AND SEX BIAS 



OVERVIEW 

The importance of examining issues of sex
bias in aptitude tests cannot be over-
emphasized. Aptitude tests are used in
high schools and in colleges to assist in
counseling students and selecting students
for college entry. In addition, aptitude
tests are often used in the employment
selection process. One of the best
known aptitude measures is the Differential
Aptitude Tests (DAT) (Bennett, Seashore,
and Wesman, 6). The titles of the tests
in the DAT suggest the differences between
educational achievement tests, interest
measures, intelligence measures, and
aptitude tests. The DAT has the following
tests: verbal reasoning, numerical ability,
abstract reasoning, clerical speed and
accuracy, mechanical reasoning, space
relations, spelling, and language usage.
Thus, there is a wider range of tests of
abilities in an aptitude battery than in an
intelligence test, and achievement tests
reflect school curricula more directly.

The DAT scores are reported to students on
each individual test and the interpretation
that is given to students is:

When you took the Differential Aptitude
Tests you were told that their purpose
was to help you understand your abili-
ties better--see where your strengths
and weaknesses lie--plan your studies
and think about your future career.
This report will tell you how you
scored and will help you use this
information as you face the need to
make many kinds of decisions:

What courses should I elect next year?
What the year after? College or not?
Business or technical course? More
science and math? How about languages?
What career should I consider? How
can I get ready for the careers that
seem reasonable? Do my abilities jibe
with my interests? With my opportuni-
ties? (Psychological Corporation, 89)

The definition of aptitudes presented to
counselors and students emphasizes that,
"...Simply--aptitude is the capacity to
learn. You take aptitude tests in order
to be able to make better predictions of
how you can expect to develop in school
and in a job." (Psychological Corporation,
89, p. 1) The emphasis on a measure of
ability as telling the individual's capa-
city for future learning and prediction of
future attainments is what makes the scores
important to students as well as those
working with students.

The use of aptitude measures in high school
decisions for course placement and vocation-
al counseling, and post-secondary admis-
sions selection and vocational counseling,
as well as use in employee selection, is
based upon the idea of capacity to learn
and the prediction of future success. This
context for interpreting aptitude scores is
often generalized to the idea that an
aptitude is "hereditary." Although the DAT
Manual (Bennett, Seashore, and Wesman, 6)
provides one paragraph on the distinction
between aptitude as heredity and aptitude
as the result of the interaction of hered-
ity and environment, many students, teach-
ers and counselors probably do not main-
tain this distinction. Again, as with the
interest measure, the emphasis on predic-
tion is less important in a time of social
change. Increasing emphasis should be
given to assisting both boys and girls to
explore their skills and interests related
to the domains of occupations and career
choices. The issues and policy recommenda-
tions below examine overt sex bias in
aptitude tests, the same-sex versus com-
bined-sex norms issue, the issue of con-
struct validity arising from sex differ-
ences in performance, and briefly, the
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mission or employment. 



ISSUE: OVERT SEX BIAS 

Two aptitude test batteries were reviewed
for overt sex bias in language usage and
sex-role stereotyped portrayal of females
and males. The two tests are the Differ-
ential Aptitude Tests (DAT) and the Flana-
gan Aptitude Classification Tests (FACT).
The review of the DAT test booklets shows
that most of the tests are relatively
"language free," consisting primarily of
numbers and figures. Exceptions are the
tests of mechanical reasoning and language
usage. The test of mechanical reasoning
is explained to students as:

How easily do you grasp the common
principles of physics as you see
them in everyday things about you?
How well do you understand the laws
governing simple appliances, machin-
ery, tools, and motions?

Students who do well on the mechanical
reasoning test usually like to find
out how things work. They often are
better than average at learning how
to construct, operate, or repair
complicated equipment...People who
do poorly on this test may find the
work rather hard or uninteresting in
physical sciences and in those shop
courses which demand thinking and
planning, rather than just skill and
using one's hands...Boys score
considerably higher than girls on
the MR and SR tests. Therefore a
girl who does quite well on these
tests, as compared with the average
girl, may still be far below the
average boy. A girl interested in
mechanical or engineering work should
ask her counselor to figure her MR
and SR percentiles in comparison with
boys as well as girls. (Psychological
Corporation, 89, p. 4)

The MR test has illustrations for each test
item and the illustrations make very clear
that this is a "man's world." The items
(Form S) show jockeys, a man and a pulley,
astronauts, men with mirrors, a man throw-
ing a ball, a man with a horse and wagon,
a man and a pulley, and two men with hoses.
The issue of interpretation and meaning of
the mechanical reasoning test will be
discussed below. The only woman portrayed
in the entire MR test is a woman in a
wheelchair.

The DAT Manual (Bennett, Seashore, and
Wesman, 6) very consistently refers to
counselors, students, and pupils as "he."
Secretaries are referred to as "she," and
a majority of the examples use males to
illustrate use of the DAT scores. (This
is not true in the casebook, Counseling
from Profiles (Psychological Corporation,
1977), where more girls than boys are used
to illustrate counseling with the test.)

The second aptitude test reviewed has
similar examples of overt sexism in lan-
guage usage and illustrations. The FACT
test booklet covers show four males and
three females, apparently in sex-stereo-
typed occupations. The women are portrayed
as a stenographer, a switchboard operator,
and a filing clerk. The males are apparent-
ly a scientist, a tailor, a butcher, and a
lecturer. There is a context for the pro-
cess being measured (e.g., Ed and Jack are
working, the foreman is a he, and so on).
Similar usage appears in the test of expres-
sion. In the test of ingenuity, women
serve as a hostess for a children's party
and to bake a cake. The men are portrayed
in occupations.


The counselor's booklet for FACT (1953)
provides case studies or examples that are
all male, with the exception of one female
whose occupation may be that of a nurse.
In summary, the FACT has overt sex bias
in both language usage and illustrations,
as does the DAT.

Policy

The policy of test users with these apti- 
tude tests should be the same as discussed 
earlier with the educational achievement 
tests and interest measures. Any aptitude 
test should be reviewed for overt sex bias. 
If sex bias is found and there is an alter-
native, the test should not be used. If
a test containing overt sex bias is to be
used, a policy to ensure sex fairness
should be developed. One of the main
uses of aptitude measures is in counseling,
and counselors should refer to the set of
recommendations discussed under interest
measures. Materials and specific coun-
that all occupations are open to qualified
persons of either sex, and an emphasis
should be placed on developing exploratory
activities so that students may check
whether aptitudes reflect interest and
experience. Test results should never
be reported to students without counseling
or interpretive materials that discuss
the possible effects of experience--partic-
ularly sex-linked experiences--on aptitude
measures.



sented for grades 8 through 12, fall
and spring norms. There are consistent
sex differences in average scores of the
clerical, spelling, language usage,
and mechanical reasoning tests, with girls
obtaining higher scores on the first
three, and boys higher on the latter.
Table 10 provides a summary of the differ-
ences in raw score equivalents of the 50th
percentile on the boys' and girls'
norms for four DAT tests.



ISSUE: SEX DIFFERENCES IN PERFORMANCE
AND THEIR INTERPRETATION

For a variety of reasons discussed
earlier, sex differences appear on some
tests in aptitude batteries. For the
FACT, the Technical Supplement (FACT,
1954) states that girls obtain higher
scores on the expression, coding, and
tables tests, and that boys obtain
higher scores on mechanics, assembly,
scales, and patterns. However, there
are no separate sex norms, and the
interpretive materials for students do not
discuss any possible sex differences.

In the Differential Aptitude Test, stu-
dents check off their sex on the answer



As shown in Table 10, there are few
sex differences in verbal reasoning,
numerical ability, abstract reasoning,
and space relations for raw score 
equivalents of the 50th percentile across 
grades 8 through 12. The differences 
shown for space relations are in the usual 
direction, with males scoring a few raw 
score points higher in the later high 
school years. 



The item selection process is not
well described for the DAT: Diffi-
culty and discrimination values were
computed for each item, by grade and
sex, and these statistics were used
as an aid in selecting the content
of the final forms of the new test.
(Bennett, Seashore, and Wesman, 6,
p. 10.)



Table 10

Similarity of Raw Score Equivalents of the 50th Percentile
for Females and Males on Four DAT Tests

Test                  Total No. of Items in Test

Verbal Reasoning      50
Numerical Ability     40
Abstract Reasoning    50
Space Relations       60

[Grade Level columns (through grade 12): the cell entries are not
legible in the source; the surviving fragments are "1" for Numerical
Ability and "2", "B", "B" for Space Relations.]

An equal sign (=) indicates the raw score equivalent of the 50th
percentile is the same for boys and girls.

Each raw score difference was for girls (G) or boys (B) having
a higher raw score, thus requiring a higher raw score for the
same percentile rank as on the other-sex norms.






on the subtests in Table 10 may be the
result of a deliberate effort to provide
similar distributions of ability for
females and males (as is done on many
intelligence tests). The differences here
are very small compared to the stereotype
of differences in female and male perform-
ance that abounds in the psychological
and popular literature for these abilities.

The DAT reports separate sex norms for
every test and battery. The FACT men-
tions sex differences in performance but
does not provide sex-differentiated norms.
The issue of concern here is which sets
of norms should be reported to students
and how the interpretive materials for
students and counselors should describe
the norms.

Policy

The difficulty of reporting combined-sex
norms (i.e., one set of norms based on both
boys and girls) is illustrated by an ex-
ample given in the DAT Manual:



The existence of such sex differences in
performance suggests that the policy being
followed with interest tests is also
appropriate for aptitude measures. Both
sets of norms should be reported to both
females and males, and the existence of
the differences in the percentile ranks
obtained on some tests should be used
in counseling and when using published
interpretive materials to explore sex-
related differences in experiences and
activities that might lead to test score


For example, the current interpretation
for girls is that:

...In counseling a girl who wishes
to enter a mechanical field, her raw
score on Mechanical Reasoning should
be compared with the boys' norms for
the same grade level. To indicate
real promise in this field, she
would need to score very high in
relation to other girls. Assuming
that she was tested in Grade 10, in
the fall semester, she would need to
be at the 95th percentile for girls
in order to equal the 75th percentile
for boys. We may anticipate, how-
ever, that with the present-day trend
of girls showing interest in subjects
and fields previously associated
mainly with boys, the difference in
scores obtained by the two sexes
will gradually diminish. (Bennett,
Seashore, and Wesman, 6, p. 46.)

Alternatively, Cronbach (15) has offered
another approach in reviewing the suit-
ability of the Armed Services Vocational
Aptitude Battery (ASVAB 77) for use with
high school students in the spring of
1978. In addition to technical difficul-
ties with the tests, Cronbach emphasized
the sex-linked shop-experience factor
of some of the tests and recommended that
the ASVAB not be used unless another
battery is used and there are coun-
selors to help interpret the ASVAB re-
sults.

Although the DAT tests are not as obvious-
ly based on shop-experience as the ASVAB,
the test of mechanical reasoning is un-
likely to provide as valid a measure of
mechanical aptitude (or the capacity to
learn) for girls. There are twice as
many girls as boys in the norm groups
who scored at the chance level in the
test (10 percent of the girls obtained
scores of 23 or less on the 3-alternative
70-item test, as compared with 5 percent of
the boys in grade 8; in grades 9 through
12, the percent of boys at the chance level
is from 3 percent to 1 percent, and for
girls the percent drops to 5 percent
and stays there for all four years).
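
The "chance level" of 23 cited above follows from simple arithmetic: on a 70-item test with 3 alternatives per item, blind guessing yields about one-third of the items correct. The short sketch below works this out. It assumes pure guessing on every item, independent items, and no correction for guessing; these assumptions and the function name are illustrative and are not taken from the DAT Manual.

```python
import math


def chance_score(n_items, n_alternatives):
    """Expected score and standard deviation under blind guessing.

    Assumes every item is answered by guessing among equally likely
    alternatives, so the score follows a binomial distribution.
    """
    if n_items <= 0 or n_alternatives <= 1:
        raise ValueError("need at least 1 item and 2 alternatives")
    p = 1 / n_alternatives
    mean = n_items * p
    sd = math.sqrt(n_items * p * (1 - p))
    return mean, sd


# Mechanical Reasoning as described in the text: 70 items, 3 alternatives.
mean_score, sd_score = chance_score(70, 3)
print(round(mean_score, 1), round(sd_score, 1))  # 23.3 3.9
```

A raw score of 23 or less is therefore no better than what guessing alone would be expected to produce.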
Some of the situations illustrated in the 
test Items need to be examined on an 
individual basis to see if the sex differ- 
ences in norms can be reduced and whether 
items based on illustrations of situations
equally familiar to boys and girls can be 



...the case of a boy, tested in the
fall of tenth grade, whose raw score
is 48 on the Mechanical Reasoning.
If boys' and girls' scores were
reported in a single distribution,
a raw score of 48 would place the boy
at the 66th percentile. On such
evidence, a counselor might be in-
clined to suggest that the boy con-
sider one of the careers requiring
mechanical intelligence (sic). This
course of action would be misleading
since the tenth grade norms based
on the boys alone show that a raw
score of 48 falls at the 50th per-
centile. The boy is then seen as
not at all superior in this ability
when compared with other boys at
his grade level. (Bennett, Seashore,
and Wesman, 6, p. 46.)
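
The effect described in the manual's example, in which the same raw score earns a different percentile rank depending on the norm group, can be illustrated with a small sketch. The norm samples below are invented for illustration only and are not the DAT norms; they were chosen so that a raw score of 48 falls at the 50th percentile among boys, as in the manual's example. The percentile-rank convention used (percent scoring below, plus half of those with the same score) is one common choice and is an assumption here.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical raw scores for two tiny norm groups (not real DAT data).
boys_norms = [38, 42, 45, 47, 48, 48, 51, 54, 57, 60]
girls_norms = [28, 31, 34, 36, 38, 40, 42, 44, 48, 52]
combined_norms = boys_norms + girls_norms

raw_score = 48
print(percentile_rank(raw_score, boys_norms))      # 50.0
print(percentile_rank(raw_score, girls_norms))     # 85.0
print(percentile_rank(raw_score, combined_norms))  # 67.5
```

With these invented samples the combined-sex rank sits between the two same-sex ranks, which is why reporting both sets of norms gives the student and counselor more to discuss than either one alone.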



fully in the item pools for one of the
interest measures).

For both the DAT and the FACT, as well as
other aptitude tests, publishers should
make available both same-sex and opposite-
sex norms to counselors and students.
Interpretive materials should discuss the
effects of sex-stereotyping on experiences
and their possible relationships to test
scores on the aptitude battery. Addition-
ally, publishers and counselors may desire
to try to quantify the students' estimates
of experiences that may relate to their
test score interpretation. This approach
may be most appropriate for the tests in
the DAT, for example, that have sex differ-
ences in performance: the clerical,
spelling, language usage, and mechanical
reasoning tests. Similarly for the FACT,
the sex differences appeared on the
expression, coding, tables, mechanics,
assembly, scales, and patterns tests.
Students might rate each test item (or
the overall test) for the number of
activities or hobbies they have partici-
pated in that are related to the item
or test (none, a few, about average,
more than most girls my age), number of
school courses related to the item (test),
and their expectation for performance on
the test compared to other boys or girls
of their age (top 25 percent, next 25
percent, and so on). These ratings can
be combined into an index of test-related
experiences and indicate that if there
were few experiences related to the test,
individuals might want to interpret
their scores cautiously and seek further
experience to test their own skills and
interests.
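
One way the three ratings described above might be combined into a single index is sketched below. The text names the rating categories but does not say how they should be scored, weighted, or thresholded, so every point value, the cap on course credit, and the cutoff for "few experiences" are hypothetical choices made only for illustration.

```python
# Hypothetical point values; the source lists the rating categories
# but gives no scoring scheme.
ACTIVITY_POINTS = {"none": 0, "a few": 1, "about average": 2, "more than most": 3}
MAX_COURSE_POINTS = 3      # assumed cap on credit for related courses
MAX_TOTAL = 9              # 3 (activities) + 3 (courses) + 3 (expectation)
CAUTION_THRESHOLD = 1 / 3  # assumed cutoff for "few related experiences"


def experience_index(activities, related_courses, expected_quartile):
    """Return an index from 0.0 (no related experience) to 1.0.

    activities: one of the keys of ACTIVITY_POINTS
    related_courses: number of school courses related to the test
    expected_quartile: 1 for "top 25 percent" through 4 for the lowest
    """
    if activities not in ACTIVITY_POINTS:
        raise ValueError(f"unknown activity rating: {activities!r}")
    if related_courses < 0:
        raise ValueError("related_courses cannot be negative")
    if expected_quartile not in (1, 2, 3, 4):
        raise ValueError("expected_quartile must be 1, 2, 3, or 4")
    points = (
        ACTIVITY_POINTS[activities]
        + min(related_courses, MAX_COURSE_POINTS)
        + (4 - expected_quartile)
    )
    return points / MAX_TOTAL


def interpret_cautiously(index):
    """True when the index suggests few test-related experiences."""
    return index < CAUTION_THRESHOLD


# A student with no related hobbies or courses and low expectations.
low = experience_index("none", 0, 4)
# A student with many related hobbies, two courses, and high expectations.
high = experience_index("more than most", 2, 1)
print(interpret_cautiously(low), interpret_cautiously(high))  # True False
```

Any working version of such an index would need its scoring and cutoff checked against actual student data; the sketch only shows the shape of the computation.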

Until such time as the publisher provides
item analysis data in the test manual
and evidence of the construct validity
of the mechanical reasoning test for
girls, this test should not be considered
sex-fair for girls. Both sex norms should
be shown to girls, possible reasons for
differences explored, and care given to
encourage girls to try activities that
may check their interests and enlarge
their experiences related to the occupa-
tions suggested as related to success on
this measure. And, the test should not
be used to retain the unequal enrollment
of boys and girls in vocational education
courses (Roby, 92). Girls should be en-
couraged to consider "non-traditional"
educational experiences as a prelude to






Sex-role related limitations of experiences
and interests should not limit women's
educational and occupational options.



ISSUE: ADVERSE IMPACT OF SELECTION TESTS



As mentioned earlier, issues of sex bias
also arise in the use of aptitude tests
in selecting students (or counseling or
otherwise encouraging their enrollment)
for vocational education courses and
college admissions. These situations are
similar to the use of tests in employment
selection settings and appear subject to
the same regulations, the Uniform Guide-
lines on Employee Selection Procedures
(Federal Register, December 20, 1977).
The Uniform Guidelines go farther than
earlier regulations in Title IX in defining
adverse impact and setting standards for
determining the fairness of tests. Title
IX, as mentioned earlier, provides regula-
tions on the use of tests in college ad-
missions such that where there is a sub-
stantial disproportion of members of one
sex in any particular course or admitted
group, the school or college has to pro-
vide assurance that the disproportion is
not the result of discrimination in the
instrument or the applications of the
instrument (i.e., a fair test and fair
use of the test).

In the Uniform Guidelines adverse impact is
defined as a selection rate for any racial,
ethnic or sex group that is less than four-
fifths (80 percent) of the rate for the
group with the highest rate. If 100 boys
apply for a mechanics program and 20 girls
apply, then if 50 boys are accepted and
no girls, these selection rates might
constitute evidence of adverse impact (the
selection rate for boys was 50 percent,
1 in 2, and for girls, 0, 0 in 20 selected).
If at least 8 girls had been accepted, then
the selection rate for girls (40 percent)
would have been at least four-fifths that
of the boys, and by this particular defini-
tion there would be no adverse impact.
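
The four-fifths comparison in the example above can be written out as a short calculation. The sketch below is a minimal illustration of the arithmetic as the text describes it, not a statement of how any agency applies the Uniform Guidelines; the function names and the data layout are assumptions. Exact fractions are used so that a ratio of exactly four-fifths is not misclassified by rounding.

```python
from fractions import Fraction

FOUR_FIFTHS = Fraction(4, 5)


def selection_rate(selected, applicants):
    """Fraction of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("a group needs at least one applicant")
    return Fraction(selected, applicants)


def impact_ratios(groups):
    """Each group's selection rate divided by the highest group's rate.

    `groups` maps a group name to a (selected, applicants) pair.
    Returns None for every group if nobody at all was selected.
    """
    rates = {name: selection_rate(s, a) for name, (s, a) in groups.items()}
    highest = max(rates.values())
    if highest == 0:
        return {name: None for name in rates}
    return {name: rate / highest for name, rate in rates.items()}


def shows_adverse_impact(groups):
    """True if any group's rate is below four-fifths of the highest rate."""
    ratios = impact_ratios(groups)
    return any(r is not None and r < FOUR_FIFTHS for r in ratios.values())


# The example from the text: 100 boys and 20 girls apply, 50 boys accepted.
no_girls = {"boys": (50, 100), "girls": (0, 20)}
eight_girls = {"boys": (50, 100), "girls": (8, 20)}
print(shows_adverse_impact(no_girls))     # True
print(shows_adverse_impact(eight_girls))  # False
```

With 8 of 20 girls accepted the ratio is exactly four-fifths, which is not "less than" the standard; with 7 accepted the ratio would be seven-tenths and the definition would again indicate adverse impact.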

Whatever the basis for selection in the 
example above, whether a test, informal 
interview, school record, or other measure, 
the assessment procedure must be examined
for bias and sex-fair use in order to 



Earlier discussions have recommended that
procedures to establish a sex-fair test be
defined, and at minimum they include
examining the test for overt sex bias
in terms of language usage, stereotyped
portrayals of females and males, and
positive representation of women; for item
bias by providing empirical data based on
one of the item analysis procedures
proposed to date; and, where sex
differences in average performance remain,
for test bias by providing further evidence
of the construct validity of the test.
This series of reviews, judgments, and
examination of empirical data will permit
a test user to decide if the test is fair
to a group of women in a particular
application of the test and then examine
the use of the test for sex fairness.
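
As a rough illustration of the empirical
step, the sketch below compares the
proportion of females and males answering
each item correctly and lists items with a
large gap. This is only a crude first screen
and is not one of the specific item analysis
procedures referred to above; the function
names, the 0.15 cutoff, and the response
data are all hypothetical.

```python
def proportion_correct(responses):
    """responses: list of per-person lists of 0/1 item scores.
    Returns the proportion answering each item correctly."""
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(person[i] for person in responses) / n_people
            for i in range(n_items)]


def flag_items(female, male, min_gap=0.15):
    """Return (item index, female minus male proportion correct)
    for items whose absolute gap is at least `min_gap`."""
    p_female = proportion_correct(female)
    p_male = proportion_correct(male)
    return [(i, round(f - m, 3))
            for i, (f, m) in enumerate(zip(p_female, p_male))
            if abs(f - m) >= min_gap]


# Hypothetical scored responses: four test takers per group, three items.
female = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
male = [[1, 1, 1], [1, 1, 1], [1, 0, 1], [1, 1, 0]]
items_to_review = flag_items(female, male)
```

Flagged items would then go to the review
groups for judgment; a raw gap does not by
itself show that an item is biased.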



Fair test use has been the subject of
considerable theoretical work in a limited,
psychometric framework (see the Vol. 13,
No. 1, 1976 issue of Journal of Educational
Measurement devoted to test bias; also
Linn, 55; and Novick and Ellis, 79). The
psychometric approach has been to examine
the predictive validity of the test for
different groups. Some of the work has
been done with college admissions tests
such as the Scholastic Aptitude Test
(Cleary, 10, for example) and other work
has been in the employment selection
setting (Einhorn and Bass, 27). However,
in both selection settings the focus has
been to examine the relationship between
the test and a criterion of success--such
as first-year college grades or successful
performance on a job. Wild and Dwyer (117)
have recommended that both the criterion
and the predictor (test) be examined for
their relationship to educational goals
and intermediate criteria such as grades
(or in employment, satisfactory
performance on the job), as well as for their
reliability for both females and males, and
for sex bias.
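
A minimal sketch of the psychometric
approach follows: it computes a predictive
validity coefficient (the Pearson
correlation between test score and
criterion) separately for each group. The
function names and the score and grade
values are hypothetical, and the sketch
does not reproduce any particular model
from the studies cited.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def validity_by_group(records):
    """records: iterable of (group, test_score, criterion).
    Returns {group: correlation of test score with criterion}."""
    groups = {}
    for group, score, criterion in records:
        scores, criteria = groups.setdefault(group, ([], []))
        scores.append(score)
        criteria.append(criterion)
    return {g: pearson_r(s, c) for g, (s, c) in groups.items()}


# Hypothetical test scores and first-year grade averages.
records = [
    ("female", 400, 2.0), ("female", 500, 2.5), ("female", 600, 3.0),
    ("male", 400, 2.4), ("male", 500, 2.2), ("male", 600, 3.1),
]
coefficients = validity_by_group(records)
```

Comparing the coefficients shows whether
the test predicts the criterion equally
well for each group, but it says nothing
about whether the criterion itself is
sex-fair, which is the point raised by Wild
and Dwyer.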

The examination of test bias in selection
using predictive validity has compared the
validity coefficients or various
proportions of groups selected under different
assumptions. There appear to be two
limitations to this conception of test
bias. From one perspective, the test
bias models so far have relied on group
categories, such as racial/ethnic groups
or sex, and examined whether the test
functioned in a fair manner for all the
individuals in a group. From another
perspective, the definition of test bias has
relied heavily on predictive validity to
examine test fairness and not placed
enough emphasis on content and construct
validity.


Criticisms of the test bias models using
group analyses have suggested that we are
typically concerned with providing equal
opportunity to individuals, for education
or employment (Novick and Ellis, 79). In
providing special admissions for minority
groups and women to law or medical schools,
we are trying to compensate for earlier
disadvantage based on discrimination
because of race or sex. The problem,
according to Novick and Ellis, is that
the disadvantage based on discrimination
does not occur equally to all individuals
within the race or sex group. The purpose
of the compensatory policy may not be met
unless there is some way of attaching an
index of "educational or equal opportunity
disadvantage" based on race, sex, social
class, and so on, to each individual
considered for hiring or admissions. This
argument adds to the predictive model an
explicit utility or value to hiring or
enrolling individuals who are disadvantaged
based on prior discrimination. Test scores
are still used, with probabilities or
estimates of successful performance in
school, but are adjusted by the values
attached to providing an education or job
to individuals who have been discriminated
against and thus denied equal opportunity
to the point of the decision being made.
This type of procedure has been formally
used before in providing bonus points to
veterans in civil service employment
(points are added just for being a veteran,
on top of whatever test scores or
credential evaluations are made). Similarly,
the system has operated informally in
college admissions where preference is
given to admitting sons and daughters of
alumni and to those who may provide
substantial funds to a college or to those
who are from different geographic regions
of the country or who have special skills
(artists or athletes, for example).
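
The idea of adjusting a predicted
probability of success by an explicit value
can be sketched in a simple additive form,
analogous to bonus points. This is an
illustrative assumption, not Novick and
Ellis's own formulation; the disadvantage
index, its 0-to-1 scale, the weight, and the
applicant data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    p_success: float     # predicted probability of success from the test
    disadvantage: float  # hypothetical index of prior disadvantage, 0 to 1


def adjusted_value(applicant, weight=0.1):
    """Predicted success plus a bonus proportional to the index."""
    return applicant.p_success + weight * applicant.disadvantage


def rank(applicants, weight=0.1):
    """Order applicants from highest to lowest adjusted value."""
    return sorted(applicants,
                  key=lambda a: adjusted_value(a, weight),
                  reverse=True)


pool = [
    Applicant("A", p_success=0.70, disadvantage=0.0),
    Applicant("B", p_success=0.65, disadvantage=1.0),
]
order = [a.name for a in rank(pool)]
```

With the weight set to zero the ranking
reduces to the prediction-only model, which
makes the value judgment in the choice of
weight explicit.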

The use of race or ethnicity as a selection
variable to a professional school was
examined by the U.S. Supreme Court in the
Bakke case for medical school admission.
That decision apparently accepts the use
of race or other indices of disadvantage
so long as there is no specific number of
admissions allocated to one group or an-

to change the narrow psychometric
prediction model to a model that includes
both probabilities of success (prediction)
and utilities. The prediction problem
does not affect women as much as minorities.
The "adverse impact" criterion of sex bias
makes clear that the problem for women is
encouraging them to enter the applicant
pool.

However, from the sex bias perspective and
examining educational tests, there is most
cause for concern with tests of mathematics
and spatial areas, since sex differences
in average performance may be found in
these tests. As mentioned earlier, there
may be some merit to trying to quantify
an index of "mechanical experiences"
(disadvantage based on sex discrimination
in the Novick and Ellis sense) or
"mathematics experiences" or "spatial
experiences" to assist in interpreting test
scores for individual women and for use on
an experimental basis in selecting women
for schools or fields of study that are
traditionally male dominated, e.g., auto
mechanics, engineering, architecture,
physics, and so on. This approach would
be consonant with the second criticism
of the work in test bias in selection to
date, the criticism that predictive
validity has been too much the focus
and the concept of validity needs to be
broader and emphasize construct validity
(Manning, 69; Messick, 75).

Manning (69) has examined recent court
decisions related to test bias for the type
of validity permitted in the legal judgments.
Although professional test standards have
emphasized predictive validity, the courts
have adopted a legal standard that evidence
of content or construct validity is also
appropriate. And Manning suggests that the
logical, hypothetical, and deductive
processes of scientific inference needed for
content and construct validity provide a
necessary balance to predictive validity.
The Uniform Guidelines provide the same
view, and give definitions of both types
of validity: content validity is
demonstrated by data showing that a selection
procedure is a representative sample of
important work behaviors to be performed
on the job; construct validity is
demonstrated by data showing that the selection
procedure measures the degree to which
candidates have identifiable
characteristics that have been determined to be
important for successful job performance.

use exists, schools and employers cannot
rely on predictive validity coefficients
as reported in the manual for tests such
as the DAT. In selecting or recommending
students for advanced science courses,
for vocational courses, for professional
schools of law and engineering, based on
tests, the test user will need to examine
the sex fairness of the test, including
its content and construct validity, and
the sex fairness of its uses in selection
or counseling. There are policy
implications of this issue.

Policy 

The issue of sex-fair aptitude tests and
fair use of these tests has implications
for school and employer policy in
establishing review procedures for tests and for
maintaining test data. The review
procedures for aptitude tests are similar to
those for achievement and interest measures
and consist of forming groups to review the
test items and accompanying manuals and
student interpretive materials from the
standpoint of sex fairness for the
particular group of women who will take the tests.
Judgments based on the review of test
materials and test data, either collected by
the publisher or locally, are part of the
set of procedures to determine sex
fairness. Where sex differences remain, and
where the test is used in selection,
further analysis of the job or performance
related characteristics of the test (content
and construct validity) is required. Part
III of the Uniform Guidelines (1977)
provides a detailed listing of all the
descriptive information and data to be given for
each type of validity.

The test user should maintain records that
provide information on the applicant groups,
and those finally selected, by sex and
racial/ethnic groups designated as black,
American Indian, Asian, Hispanic, and white
(Caucasian other than Hispanic). These
classifications are also those used for
developing and monitoring affirmative
action programs and could also be used for
the new VEA requirements to monitor and
affirm equal access to vocational
educational programs.
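
One way such records might be summarized is
sketched below: applicants and those
selected are counted for each combination
of sex and racial/ethnic group. The record
layout, function name, and sample records
are hypothetical; only the classification
categories come from the text.

```python
from collections import Counter


def tabulate(records):
    """records: iterable of (sex, group, was_selected).
    Returns {(sex, group): (applicants, selected, selection rate)}."""
    applied = Counter()
    selected = Counter()
    for sex, group, was_selected in records:
        applied[(sex, group)] += 1
        if was_selected:
            selected[(sex, group)] += 1
    return {key: (n, selected[key], selected[key] / n)
            for key, n in applied.items()}


# Hypothetical applicant records for one program.
records = [
    ("female", "black", True),
    ("female", "black", False),
    ("female", "Hispanic", False),
    ("male", "white", True),
    ("male", "white", True),
    ("male", "Asian", False),
]
summary = tabulate(records)
```

The resulting selection rates are the
figures needed to apply the four-fifths
comparison described earlier in this
section.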

Other policies, such as those described
with the use of interest measures and
sex-fair counseling, will require
additional data and records and may be appropriate
for aptitude measures also.



CONCLUDING REMARKS 



This paper on equity for women
in educational testing has focused on
issues of sex bias in educational
achievement tests, career interest inventories,
and aptitude tests. Where each issue
was raised and analyzed for implications
of sex bias, policies have been suggested
that are within the authority and
responsibility of local and state educational
administrators, teachers, counselors,
parents, and even students.

One outcome of the review is the conclusion
that sex bias or sex-fairness does not
exist as a general quality of any
educational test, any more than test
reliability or validity exists as a general
quality of the test. Hence there is a
need for an understanding at the
policymaking level in education that there will
need to be a series of procedures,
checklists, and operational policies that will
consider sex bias and sex-fairness in state
and local use of educational tests. Policy
is needed to establish the use of the
appropriate set of procedures in each
testing situation.

This view recognizes that there is no one
estimate or figure that will say that a
test is sex-fair. Rather, there are in
the process of development what will
become commonly accepted codes or sets of
rules that, when applied to the evaluation
of the sex-fairness of a given educational
test in a particular context, will permit
the determination whether the test and its
use will be sex-fair. Evidence for the
codification or the set of rules is in the
proposed common regulations of the Civil
Service Commission, the Equal Employment
Opportunity Commission, the Department of
Justice, and the Department of Labor--the
Uniform Guidelines on Employee Selection
Procedures (Federal Register, December 20,
1977) and in the NIE Guidelines for the
Assessment of Sex Bias and Sex-Fairness in
Career Interest Inventories. These are
more explicit with respect to test bias
and sex-fairness than the current
professional standards as codified in the
Standards for Educational and
Psychological Tests (American Psychological
Association, 101). It is likely that the
Standards will be brought more in line
with federal regulations and the NIE
Guidelines when they are revised in the near
future.

Distinctions have been made between the set
of procedures (reviews, data analyses, and
judgments) that will assist in the
evaluation of whether a test is sex-fair for a
particular group of women or girls in a
particular setting, and whether there is
sex-fair use of the test. In the first
instance the test will be reviewed by the
publisher in the test development process
and by local committees for overt sex bias
in language usage, stereotyped portrayal of
men and women, and positive representation
of women in all areas of culture. Similar
reviews will examine the test administration
manual, technical manual, and all
interpretive materials provided to test
administrators, counselors, teachers, and students.
Other evidence of sex-fairness of the test
will be in the empirical data presented,
including the statistical analyses of items
for bias, decision rules for selecting
items, and the average performance of
females and males on the test. Where sex
differences in performance are found,
evidence of the construct validity of
the test for women is required. Issues
concerning the types of norms that are
required when mean sex differences occur
were discussed and policy recommendations
made.

use of achievement test scores, career
interest inventories, and aptitude tests
were discussed. Of particular concern
are course enrollment, vocational
preparation, college major, and labor force
participation data showing girls and women
in sex-segregated areas--under-represented
in crafts and trades, mathematics, and
sciences, and in major professions such as
business, law, and medicine. Procedures
and policies are recommended to encourage
women to enter non-traditional areas and to
ensure that educational testing does not
contribute to inequity in educational
opportunities.

The sex-fair use of aptitude tests in
course, program, and employment selection
was also examined. A changing emphasis
from predictive validity to requiring job-
and performance-related analyses of tests
and criteria is seen as a positive move
away from a narrow perspective of test
bias. Logical judgments and careful
development of the rationale for any use
of aptitude tests where adverse impact
is found should permit a better analysis
of the role of past experience and
sex-role expectations and stereotypes for any
possible effect on the observed performance
of girls and women on educational tests.

While the emphasis in the review has been
on issues and policy in relation to sex
bias and educational tests, the sets of
procedures suggested for local review and
data analysis will contribute to the
better use of educational tests in general.
Too often tests are used and scores
obtained and interpreted without a
consciousness of the individual items in the test
and an examination of student performance
at the item level. This focus, it is
argued here, will contribute to the
integration of tests with curriculum and
instruction on a broader basis. In this sense,
considerations of educational equity for
women contribute to improved education,
a worthy goal for all concerned with the
educational process.
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